Rebel Tech Newsletter: The Opaqueness of AI


May 23rd, 2023

The Rebel Tech Newsletter is our safe place to critique data and tech algorithms, processes and systems. We highlight a recent data article in the news and share resources to help you dig deeper into understanding how our digital world operates. DataedX Group helps data educators, scholars and practitioners learn how to make responsible data connections. We help you source remedies and interventions based on the needs of your team or organization.


IN DATA NEWS

“Anna Rogers, an assistant professor at the IT University of Copenhagen, proposes that models that are not open and reasonably reproducible cannot be considered a requisite baseline. She also believes that relevance != popularity. The fact that every new model is classified as the Thing-Everybody-Is-Talking-About doesn't make it a yardstick for researchers to compare their work. Members of the research community can push back against this cycle, but it takes effort. The most important details are that closed models must disclose enough information about their architecture, training setup, data, and operations at inference time, and that researchers must not buy into hype.”

After I read this Medium post, CeCe Peniston’s song “Finally” popped into my head. Well, the song’s hook did: “Finally it has happened to me right in front of my face / And I just cannot hide it”.

Finally. The AI and ML research community is becoming tired of its own predictable, uninspired and less valuable approaches to showing the relevance of the current “state-of-the-art” AI/ML/DL model. This research community feeds into the AI hype cycle by conducting its work on whatever ‘The-Thing-Everybody-Is-Talking-About’ model du jour is. “[T]his kind of reasoning simply no longer applies to closed models that do not disclose enough information about their architecture, training setup, data, and operations happening at inference time.” And the AI research narrative basically produces 5 types of papers, as outlined below:

1. We propose a machine learning model that improves on the state-of-the-art
2. We propose a new challenging task/benchmark/metric
3. We show that model X does/doesn’t do Y (model analysis and interpretability)
4. We show that model X is (un)fair/biased etc
5. We develop a more efficient solution than model X

As a formerly frequent reviewer of AI, ML and information retrieval conference papers, I saw this pattern before COVID times in what was considered innovative and novel – and I was not impressed. Closed-system modeling and benchmarking are disingenuous and enable research construction to remain a very opaque process. In practice, these models aren’t closed but influenced by historical, social, economic and political perspectives. If the release of GPT-4 taught us nothing else, it taught us that. It would be great if this article were reading #1 in every AI research-oriented class. AI researchers, from aspiring to seasoned, need to have their assumptions constructively challenged on a regular basis and more rigorously than through the one-sided peer review process.

So let me take this research monotony revelation to the next level. The AI research community should be pushing not only for more accountability of its systems, tools and platforms, but also for more transparency of these models before they are released. The high degree of opaqueness in AI model design and implementation needs to be under much more scrutiny. We need a variant of the U.S.’s Centers for Disease Control and Prevention for technology. Oh wait, I’ve already suggested that in Data Conscience, Chapter 9: “Center for Technological Civility (CTC), is a national technological institute of the United States. CTC’s main goal is to ‘protect the public health and safety through control and prevention’ of technological harm in the United States and internationally. The CTC would oversee the impact and ramifications of deepfakes, hate speech discourses and other tech civility efforts.”

Like what you're reading? Find it informative and insightful? You can sponsor the Rebel Tech Newsletter and follow on LinkedIn.


DATA CONSCIENCE CORNER

"The mechanical work associated with data consumes us. What we ignore is the impact that our actions with data is having and will have once data leaves our data/software team." pg 28 Data Conscience

Microsoft’s Chief Scientific Officer and ‘The Godfather of A.I.’ recently went public to warn the tech community and general public about AI harms. Each person emphasized their concerns over the technology, which only adds to the fear mongering and AI hype cycle we’ve been witnessing since the AI Pause open letter was released on March 22, 2023. Their words and interviews didn’t address the racism, sexism, classism, ableism and other -isms that AI clearly perpetuates. Their words didn’t touch on the massive amounts of data siphoned from the interwebs without consent or consequences. Their words are empty and devoid of data/AI accountability. We therefore need to pair every mechanical/computational stage in the data pipeline with an equity counterpart. For instance, during the data collection phase, we’ll perform the data cleaning and reformatting procedures (computationally-focused) alongside an institutional review board (IRB) screening of the data collection process (equity-focused). That way, algorithmic bias and discrimination can be identified, cataloged and vetted for potential societal impacts. The days of separating the creator/data worker from their creation/data product are long gone.
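One way to picture this pairing is as a pipeline where every computational step travels with an equity check whose findings get cataloged. The Python sketch below is only an illustration under my own assumptions: the `Stage` class, the function names, and the `consent` field are hypothetical placeholders, and the consent check is a simple stand-in for an IRB-style screening, not a substitute for one.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Stage:
    """A pipeline stage that pairs a computational step with an equity check."""
    name: str
    compute: Callable[[List[dict]], List[dict]]      # computationally-focused
    equity_check: Callable[[List[dict]], List[str]]  # equity-focused


def clean_records(records: List[dict]) -> List[dict]:
    """Computational step: strip whitespace and drop records with empty text."""
    return [
        {**r, "text": r["text"].strip()}
        for r in records
        if r.get("text", "").strip()
    ]


def consent_screening(records: List[dict]) -> List[str]:
    """Equity step (hypothetical stand-in for an IRB-style review):
    flag every record that has no documented consent."""
    return [
        f"record {r['id']}: no documented consent"
        for r in records
        if not r.get("consent")
    ]


def run_pipeline(stages: List[Stage], records: List[dict]):
    """Run each stage, cataloging equity findings next to the computation."""
    findings: Dict[str, List[str]] = {}
    for stage in stages:
        # The equity check reviews the data going into the stage...
        findings[stage.name] = stage.equity_check(records)
        # ...and the computational step then transforms that data.
        records = stage.compute(records)
    return records, findings


if __name__ == "__main__":
    sample = [
        {"id": 1, "text": " survey answer ", "consent": True},
        {"id": 2, "text": "scraped post", "consent": False},
    ]
    stages = [Stage("data collection", clean_records, consent_screening)]
    cleaned, findings = run_pipeline(stages, sample)
    print(cleaned)
    print(findings)
```

The design choice here is that a stage cannot be defined without its equity counterpart, so the catalog of findings is produced by the same run that produces the data product.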


A WORD FOR BLACK WOMEN IN DATA

In a world that tries to confine Black women, have the courage to shatter the mold and rewrite the rules. Embrace your unique voice, ideas, and talents, sis. Dare to challenge the status quo and bring your authentic self to every table. Don't let anyone dismiss or diminish your worth.

YOU are the architect of your destiny, and it's time to create the life you’ve always desired for yourself.

Embrace the rebel within you. Be unapologetically bold, audacious, and fearless in pursuing your dreams. Take ownership of your accomplishments (yes YOU, sis take up all of the space), your brilliance, and your unlimited potential. With every project, every presentation, and every data-driven insight, know that you have the power to transform industries, challenge biases, and pave the way for generations to come.

Remember, you're a force to be reckoned with. Rise up, and show the world that you can’t be contained. The stage is yours, and I’m cheering you on every step of the way.


Daily-ish rest routine suggestion: I watch a funny 30-minute comedy nightly. I've already re-watched The Golden Girls, Martin, and Living Single.

Black Women in Data Summit

September 23-24, 2023

ATL | Online

$499 In-Person | $129 Online

First, check out these hotel accommodation suggestions for those of you planning to attend in-person. Second, FYI, there’s no room block, but we’re months away from the Summit so you could get a great rate. We’ve also updated the Summit’s schedule. The times have been shifted a little bit – September 23, 9AM-5:30PM EST and September 24, 10AM-2:45PM EST. Also, our speakers’ session titles and descriptions are now available. And lastly, reserve your spot, as in-person tickets go up to $599 on June 20, 2023.


UPCOMING EVENTS

The Data Conscience Book Tour 2023 | Informatica Juneteenth Event

On June 8, I’ll be speaking with the Informatica team about the racial bias that exists within data operations and discussing how some of these inequities can be mitigated in data and AI applications! Bringing data ethics to life requires all of us, whether you consider yourself a data professional or not, to go beyond connecting the data points and see the people whom those data points represent.

Follow us on social

Twitter | Instagram


LAUGHING IS GOOD FOR THE SOUL

Stay Rebel Techie,

Brandeis

Thanks for subscribing! If you like what you read or use it as a resource, please share the newsletter signup with three friends!

Brandeis Marshall - DataedX

Learn how to make more responsible data connections. I help educators, researchers and practitioners align data policies, practices and products for equity. Sign up for my Rebel Tech Newsletter!
