January 31st, 2022
The Rebel Tech Newsletter is our safe place to critique data and tech algorithms, processes and systems. We highlight a recent data article in the news and share resources to help you dig deeper into understanding how our digital world operates. DataedX Group helps data educators, scholars and practitioners learn how to make responsible data connections. We help you source remedies and interventions based on the needs of your team or organization.
IN DATA NEWS
"Microsoft, its subsidiary GitHub, and its business partner OpenAI have been targeted in a proposed class action lawsuit alleging that the companies’ creation of AI-powered coding assistant GitHub Copilot relies on “software piracy on an unprecedented scale.”
The case is only in its earliest stages but could have a huge effect on the broader world of AI, where companies are making fortunes training software on copyright-protected data. Matthew Butterick said, “I think it’s really simple. AI systems are not magical black boxes that are exempt from the law, and the only way we’re going to have a responsible AI is if it’s fair and ethical for everyone. So the owners of these systems need to remain accountable.”
GitHub Copilot is thievery. PeriodT. Full Stop.
GitHub Copilot intentionally severs the relationship between creator and creation. This purposeful yet painstaking act is accomplished by 1) stripping the humanity that's uniquely part of the creator to make the creation and then 2) monetizing the creation immediately. Expediting public use and adoption of tech helps solidify organizations' reliance on the tech and embeds the tech as essential for business operations. The revenue and subsequent profit streams wind up multiplying exponentially.
In response to this backlash, GitHub did its own "research" to showcase the benefits of its thievery from the programmer's perspective: 74% say they focus on more satisfying tasks, 88% feel more productive and 96% are faster with repetitive tasks. There's no third-party independent vetting of this "research" so imho, it's a public relations blitz. When you dig deeper into their evaluation, the GitHub Copilot adoption numbers are alarming -- "users accepted on average 26% of all completions shown by GitHub Copilot. We also found that on average more than 27% of developers’ code files were generated by GitHub Copilot, and in certain languages like Python that goes up to 40%."
So let's translate this more concretely: for those who use GitHub Copilot, at least 1 in 4 pieces of code is stolen. Therefore, the code they're now creating is mediocre and equates to a solid C letter grade, not the grade A work it could be if the code were done entirely by the programmer & software libraries. The algorithmic-based innovations to be produced, if/when GitHub Copilot is used, will likely be this bland, mediocre variant. And more importantly, any algorithmic bias and oppression tactics embedded in the codebase will be even further entrenched, disseminated and deployed unchecked. Legally, there's no telling how the law will handle this case. What's more certain is that the effect on humanity will be deep and wide, and will cast a long algorithmic shadow.
Watch this space.
|Read "The lawsuit that could rewrite the rules of AI copyright" Here|
Like what you're reading? Find it informative and insightful? You can sponsor the Rebel Tech Newsletter and follow on LinkedIn.
DATA CONSCIENCE CORNER
We All Count is a project to increase equity in data science by writing articles, fostering communities, and training individuals using its Data Equity Framework. The Data Equity Framework is a systematic process that provides a set of tools, checklists, and practices that allow you and your team to make choices intentionally, in a way that achieves the equity goals you and your team have identified for your work.
At its core, We All Count’s philosophy is a way of thinking about and engaging with equity in data projects from start to finish. It breaks down data work into meaningful, manageable steps, each with its own unique equity issues and solutions.
|Learn more about We All Count here!|
A WORD FOR BLACK WOMEN IN DATA
We're on the cusp of what I've deemed 'Black Women's History Months'. See, in the United States, February is designated as Black History Month and March is designated as Women's History Month. I've combined the 59 days (Feb 1 - Mar 31) to be Black women's self-imposed public celebration. Being a Black woman in America, at any time in America's history, comes with a slate of unreasonable, superhuman expectations. IMHO, we deserve these two months to be cherished and amplified. And, honestly, we're asked to be on every panel, fireside chat and any other event planning committee in February and March. So why shouldn't we receive the gratitude and coins during this time?
Remember that you are a gift.
And sis, did you get your BWD Summit tix?!? I don't want you to have FOMO in September when prices are $699 (in-person) and you see the swag -- yes, the YETI is coming back. That's all I'll share, for now!
Black Women in Data Summit
September 23-24, 2023
ATL | Online
Early Early Bird Tix Available NOW til Feb 15th
$399 In-Person | $79 Online
|Grab your EEB BWD Summit tix!|
And sis, join as a BWD Annual member ($350 investment)! You'll get a complimentary ticket to the BWD Summit 2023 and we'll add 3 virtual tickets to our sponsor-a-sis pool. BWD Annual also gives you free access to the conversations, worksheets, resources and other BWD events. We're gonna thrive in this data industry together. First monthly BWD Community event of 2023 is happening January 29th -- "Saying NO to Unpaid Work". I'm sharing all the tea I've got on how I say 'NO' and don't feel (as) guilty about it.
Daily-ish rest routine suggestion: Clear your mind by listening to the sounds you can create on the Atmosphere app -- it's free, btw. They offer a variety of environments, from beach to East Asian, so you can find the combination of sounds to soothe your active mind. Set it for as many minutes as you need to relax and restore.
Follow us on social
Data Conscience Book Tour | Women in Data Podcast (Out Now!)
My episode on the Women in Data's Podcast "Data Bytes" has been released. The Women in Data book club is an opportunity for WiD members to come together every month to discuss books focused on data, feminism, diversity and more.
In this episode we talk about my journey and tips for writing a book, and explore measures that can be taken to reduce the risk of AI bias.
|Listen to the Episode Here!|
|Join the Data Conscience Book Tour|
Book Tour | CSTA's Black Excellence in Data and Computer Science Webinar
During the upcoming Black History Month in February 2023, I will be participating in the Computer Science Teachers Association's featured webinar, which will focus on celebrating Black Excellence in Data/Computer Science.
I'll be providing my perspective on how social justice intersects with data science. Join the conversation as I provide my take on how we as educators can impact students and how we together can improve the world of tech.
CLICK HERE to grab your copy of my book, Data Conscience: Algorithmic Siege on our Humanity!
LAUGHING IS GOOD FOR THE SOUL
Stay Rebel Techie,
Thanks for subscribing! If you like what you read or use it as a resource, please share the newsletter signup with three friends!