Ethics of Machine Learning – Uncertainty

I saw something this morning that highlights one of the problems with machine learning systems and the limits of their capabilities. It's an extreme case, but it illustrates the point aptly.

https://www.nytimes.com/interactive/2024/07/18/technology/spain-domestic-violence-viogen-algorithm.html

The story highlights something important about ML/AI: it can be systematically incorrect if there were material omissions from the training data. This may not be the point of the article, but it's still true. Unseen aspects of the data can also appear in the outputs, whether intended or not. The issues with the model in the article could be as simple as survivorship bias. If no training data came from unreported crimes, it is difficult to evaluate how effective the model is without testing it in vivo and generating more data, which can then be used to re-train (read: improve) the model.
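The survivorship-bias concern can be sketched with a toy simulation. Everything here is hypothetical and has nothing to do with the actual VioGén model: the point is only that if high-risk cases are less likely to show up in the reporting data, a cutoff learned from reports alone misjudges the full population.

```python
# Toy illustration of survivorship bias in training data (hypothetical numbers,
# not the real system): high-risk cases are less likely to be reported, so a
# risk cutoff learned only from reported cases misjudges the full population.
import random

random.seed(0)

# True risk for each case in a hypothetical population, uniform in [0, 1).
population = [random.random() for _ in range(10_000)]

# Reporting is biased: the higher the risk, the less likely the case is reported.
reported = [r for r in population if random.random() > 0.5 * r]

# Learn a "flag the riskiest 10%" cutoff from the reported cases only.
reported.sort()
cutoff = reported[int(0.9 * len(reported))]

# Apply that cutoff to the full population: it flags noticeably more than 10%,
# because the reported sample under-represents the high-risk tail.
flagged_share = sum(r > cutoff for r in population) / len(population)
print(f"cutoff learned from reports: {cutoff:.2f}")
print(f"share of full population flagged: {flagged_share:.1%}")
```

The cutoff looks right on the data the model saw; it's only against the unobserved population that the miscalibration shows up, which is exactly why in-vivo testing and re-training matter.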

This feedback loop does lead to improvement, and there's still value in creating these approximations and in finding ways to keep feeding new data points back into the model. Adding more parameters to capture more nuanced relationships between inputs can drive the approximation close to 100%, but it will never be 100%. Computation takes time, and as soon as time passes, things have changed.

In the case of the algorithm in Spain, having it as a tool is helpful, but continuous data collection is necessary to keep the model learning as time moves forward. It's easy to focus on how we can better predict and prevent negative outcomes, yet we may be missing the larger point: perhaps laws need to be tougher or better enforced, cultural values have to evolve, or the New York Times has to publish an article about domestic abuse in Spain to bring enough attention to the matter for meaningful change. Even then, the AI did its job as the researchers who created the model intended. My guess is that the researchers simply wanted to help victims, and they did, through the accurate predictions of the model they assembled. Was their goal to catch more than 50% of dangerous situations? 75%? 90%? It can't get to 100% in an analog, continuous world. Physics struggles with the same concept: uncertainty. And that's the biggest problem with machine learning and AI: you can't ever get rid of uncertainty. Physics hasn't gotten around that one yet either.

This is a universal hazard of discretized systems: if you keep moving halfway from your start toward your destination, you'll never arrive, but you can always keep going.

Large rule sets in Snort on pfSense cause PHP memory crash

Ran into an issue with Snort on pfSense where the memory limit specified in /usr/local/pkg/snort/snort.inc is insufficient and the service crashes shortly after launch.

The part that needs to be increased is the ini_set("memory_limit", ...) call below.

<?php
/*
* snort.inc
*
* part of pfSense (https://www.pfsense.org)
* Copyright (c) 2006-2023 Rubicon Communications, LLC (Netgate)
* Copyright (c) 2009-2010 Robert Zelaya
* Copyright (c) 2013-2022 Bill Meeks
* All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

require_once("pfsense-utils.inc");
require_once("config.inc");
require_once("functions.inc");
require_once("service-utils.inc"); // Need this to get RCFILEPREFIX definition
require_once("pkg-utils.inc");
require_once("filter.inc");
require_once("xmlrpc_client.inc");
require("/usr/local/pkg/snort/snort_defs.inc");

// Snort GUI needs some extra PHP memory space to manipulate large rules arrays
ini_set("memory_limit", "4096M");

// Explicitly declare this as global so it works through function call includes
global $g, $rebuild_rules;

/* Rebuild Rules Flag -- if "true", rebuild enforcing rules and flowbit-rules files */
$rebuild_rules = false;

 

If that limit is too low, PHP will produce an error like this while Snort loads:

[25-Jan-2023 20:25:20 America/New_York] PHP Fatal error: Allowed memory size of 402653184 bytes exhausted (tried to allocate 12288 bytes) in /usr/local/pkg/snort/snort.inc on line 1093

The file is overwritten each time the package is updated, so you have to re-apply this change after every update.
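Since the change keeps getting clobbered, a small script can re-apply it after each update. Here's a sketch in Python (assuming Python is available on the box); the path and the 4096M value come from the snippet above, so adjust both to taste:

```python
# Sketch: re-apply the memory_limit bump to snort.inc after a package update.
import re

SNORT_INC = "/usr/local/pkg/snort/snort.inc"

def bump_memory_limit(text: str, limit: str = "4096M") -> str:
    """Rewrite the ini_set("memory_limit", ...) line in snort.inc source text."""
    return re.sub(r'ini_set\("memory_limit",\s*"\d+M"\);',
                  f'ini_set("memory_limit", "{limit}");',
                  text)

# Typical use on the firewall (run as root):
# with open(SNORT_INC) as f:
#     patched = bump_memory_limit(f.read())
# with open(SNORT_INC, "w") as f:
#     f.write(patched)
```

Matching on the whole ini_set() call rather than a specific default value means the script works no matter what limit the updated package ships with.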

N.B. If the install itself fails to complete due to memory exhaustion, you can work around it by going into Snort and removing a character from your oinkcode. This prevents the rule set from being downloaded and allows the install to complete, since it's the enumeration of rules that exhausts the memory.

Thoughts on a Monday

It might be time to find new hosting. The site is so much slower than it used to be.

 

Edit: Moments after posting, I get this. We’ll see if things improve.
