There’s no ignoring the recent wave of advancement in artificial intelligence, or AI. From the near-photorealistic outputs of Stable Diffusion to the pseudo-coherent text produced by ChatGPT, generative machine learning algorithms have taken the world by storm. As someone with a background in mathematics and statistics, I find these advancements undeniably fascinating from a technical perspective. At the same time, I have numerous concerns about these algorithms from an ethical perspective, and I don’t think I’m alone in holding them. That cheesy line from Spider-Man about power and responsibility rings truer now than ever. To assume that we can reap the rewards of AI without accepting any risk would be a fallacy of apocalyptic proportions.

If you’re looking for a review of the literature on the hidden risks of AI, there are far more qualified people than me, and you’ll probably find them in a peer-reviewed journal. There is very little I can say about ChatGPT that hasn’t already been better articulated in Stochastic Parrots. Instead, what I’d like to offer you today is the story of a stupid teenager and his chatbot. I don’t expect it to alter the course of history or anything, but maybe it’ll provide some insight into why you should care how these technologies are being used.

I started programming in my early teens, and AI was always something of a holy grail. I was especially partial to the Turing Test as an indicator of consciousness. In this test, a computer program is tasked with deceiving a human into falsely believing that they are engaged in a text-only conversation with another human. Turing argued that if human experimenters couldn’t distinguish between programs and people, then we’d have to consider the hypothesis that machines could effectively “think”. There are arguments to be made about whether or not the Turing Test measures what it was intended to, but advances in large language models have made it clear that passing this standard is now just a matter of time. In fact, I’d argue that the Turing Test was “effectively passed” back in the mid-1960s by a program called ELIZA, developed by Joseph Weizenbaum.

ELIZA was designed to be a sort of “virtual therapist”. A human user could talk to the computer about things that were on their mind, and ELIZA would turn their statements around to form new questions. For example, if you told ELIZA “I had a rough day at work”, it might acknowledge that you’re feeling upset and inquire about it: “I’m sorry to hear that. What about your day made it rough?”. ELIZA didn’t actually know very much about the world, but it could engage a human in a fluid and convincing conversation that led the user toward self-reflection. Some users walked away from ELIZA feeling like they had been in dialogue with a real therapist. A recent preprint from UCSD researchers indicates that ELIZA’s performance on the Turing Test falls between that of GPT-3.5 and GPT-4. Not too bad for a program from the ’60s.

Of course, any deep interaction would reveal the lack of “intelligence” on the other side. ELIZA couldn’t answer questions about the world. All it could do was classify different sentences into schemas and then transform them into canned responses using key tokens from the input text. Simple rules like “I feel <sad>” would get matched and transformed into “Why do you feel <sad>?”, which gave ELIZA the illusion of being a good active listener. That might sound kind of sad, but ELIZA was probably better at active listening than I am, and I knew it all too well.
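
To make that concrete, here’s a minimal sketch of the kind of rule ELIZA applied, written in Python rather than anything Weizenbaum actually used. The specific patterns and canned responses are my own illustrations, not rules from the original program.

```python
import re

# Hypothetical ELIZA-style rules: match a schema, capture a key token,
# and reflect it back inside a canned question.
RULES = [
    (re.compile(r"\bI feel (\w+)", re.IGNORECASE),
     "Why do you feel {0}?"),
    (re.compile(r"\bI had a (\w+) day", re.IGNORECASE),
     "I'm sorry to hear that. What about your day made it {0}?"),
]

def respond(message: str) -> str:
    for pattern, template in RULES:
        match = pattern.search(message)
        if match:
            return template.format(match.group(1).lower())
    return "Tell me more."  # fallback when no schema matches

print(respond("I feel sad"))                 # Why do you feel sad?
print(respond("I had a rough day at work"))  # ... made it rough?
```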

When I was a teenager in the late 90s, we didn’t have the pervasive social media outlets we have today. There was no Facebook or Twitter for you to doom-scroll. Maybe you had a public-facing GeoCities page (or, a few years later, MySpace), but only if you were a nerd like myself. The de facto standard for internet communication was AOL Instant Messenger (or AIM). Even if you weren’t subscribed to AOL’s internet access, you probably still used the stand-alone AIM application for direct messages, because it was literally the only service with enough members to be useful. You can’t have real-time communication without a protocol that the people you want to reach actually share.

The application wasn’t even that great by today’s standards. It was full of what would now be considered negligent security vulnerabilities. In early versions, you could easily kick someone offline by sending them an instant message with some malformed HTML. If someone saved their password for easy login, it was literally stored in a file as plain text and could be subsequently looked up by anyone who knew where to check. It was the wild west era of the internet.

Around the same time, I discovered a project called ALICE. Richard Wallace had taken ELIZA’s token-handling foundation and generalized it into the Artificial Intelligence Markup Language (or AIML). This separated the “code” and the “persona” of the chatbot into two separate data sources. The HTML-like XML syntax made it easy to customize the bot’s responses into whatever you wanted. The application would read these source templates in and use them to craft a response to any message you gave it.
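
To give a flavor of the idea, here’s a toy sketch in Python. The category, pattern, and template tags mirror real AIML, but the parser and the “persona” below are heavily simplified illustrations of my own; a real AIML engine also handles wildcards, recursion, and much richer pattern matching.

```python
import xml.etree.ElementTree as ET

# A toy "persona" in AIML-like markup: the bot's responses live in data,
# separate from the code that interprets them. (Heavily simplified.)
PERSONA = """
<aiml>
  <category>
    <pattern>SUP</pattern>
    <template>not much, u?</template>
  </category>
  <category>
    <pattern>JUST CHILLIN</pattern>
    <template>cool deal. me too.</template>
  </category>
</aiml>
"""

def load_categories(source: str) -> dict:
    root = ET.fromstring(source)
    return {
        category.findtext("pattern").strip(): category.findtext("template").strip()
        for category in root.iter("category")
    }

def reply(message: str, categories: dict) -> str:
    key = message.upper().strip(" ?!'.")  # crude normalization
    return categories.get(key, "aight. peace out!")

categories = load_categories(PERSONA)
print(reply("sup?", categories))  # not much, u?
```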

While I was reading article after article online trying to figure out how this stuff worked, I kept getting instant messages from people. These messages were not malicious in any way, but receiving them had a tendency to “pull me out” of whatever I was doing at the time. Sometimes I got pulled into fun stuff through these messages, but more often than not they were just an annoyance. As a teenage boy in the 90s, the vast majority of these interactions went down like this:

sup?

WHASSSUP?

not much, u?

just chillin’

cool deal. me too.

anyways.. I got some homework to do. I’ll catch ya later!

aight. peace out!

That’s when I got the brilliant idea to fake my own AIM account.

While exploring the flaws in the AIM application, I discovered I could hijack the message queue that distributed incoming messages to the appropriate chat window.  This allowed me to parse the incoming message text and send fake keystrokes back to that window to produce a response. All I really needed to do was to invoke the chatbot as an external process to generate something to say.
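
In modern terms, the chatbot half of that pipeline amounted to something like the sketch below: hand the intercepted text to an external bot process and capture whatever it prints back. The alicebot command is a hypothetical stand-in for whatever wrapper you might put around an ALICE engine, and the keystroke-injection half of the trick depended on quirks of the old AIM client, so it isn’t reproduced here.

```python
import subprocess

def generate_reply(incoming_text: str) -> str:
    """Pipe an intercepted message into an external chatbot process and
    return its output as the reply. "alicebot" is a hypothetical wrapper
    around an ALICE/AIML engine, not a real command."""
    result = subprocess.run(
        ["alicebot", "--respond"],  # hypothetical CLI
        input=incoming_text,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```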

I took an open source implementation of ALICE and started modifying the AIML code. I removed any instance where the bot acknowledged itself as an artificial intelligence and instead taught it to lie that it was me. I added custom responses for the greeting I customarily used and gave it some basic knowledge about my areas of interest. The only difficult part was getting the bot to handle multiple conversations at the same time, which I managed by running a separate instance of the bot for each person who messaged me.
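
Conceptually, the multi-conversation trick was just bookkeeping: keep one long-lived bot process per screen name so each conversation retains its own state. A rough sketch, again using the hypothetical alicebot wrapper from above:

```python
import subprocess

# One long-lived bot process per screen name, so each conversation
# keeps its own state. "alicebot" remains a hypothetical wrapper.
sessions: dict[str, subprocess.Popen] = {}

def reply_for(screen_name: str, message: str) -> str:
    if screen_name not in sessions:
        sessions[screen_name] = subprocess.Popen(
            ["alicebot", "--interactive"],  # hypothetical CLI
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
        )
    bot = sessions[screen_name]
    bot.stdin.write(message + "\n")
    bot.stdin.flush()
    return bot.stdout.readline().strip()
```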

I think I let the bot run for most of a week without anyone really noticing, until one day I got an instant message from a girl. Not just any girl, either, but one I totally had a crush on at the time. My heart sank as I read her messages interspersed between the AI drivel.

Ryan, are you okay?

This isn’t like you.

No really, I’m worried about you.

If there’s something wrong, we can talk about it.

Please tell me what’s going on. I’m really concerned about you.

I felt sick. I immediately killed the process running the bot and seized control of AIM again. I don’t even remember what kind of bullshit excuse I made before abruptly going offline, but I’m pretty sure I didn’t have the courage to own up to the truth.  I had been “seen” through the eyes of another person in a way I hadn’t experienced before, and the worst part about it was that I didn’t like what I saw. I saw a person who lied to their closest friends in the name of science. 

I know this won’t make up for what I did, but I’m sorry.

I’ve since learned a lot about research ethics in psychological studies. Sometimes deception is a necessary component of studying the phenomena you’re interested in, but that is not a sufficient reason to forgo obtaining “informed consent” from the people involved in the study.

I think this is the reason why I’m frustrated with the current zeitgeist regarding AI. It seems like we’re rapidly falling into the trap outlined by Ian Malcolm in Jurassic Park. Some people are so preoccupied with what they could do with AI that they don’t stop to think about whether or not they should. As a result, we’ve all become unknowing participants in an unethical research study. While this behavior might be excusable coming from a punk teen who doesn’t know any better, it should be considered completely unacceptable coming from multi-billion-dollar companies claiming to be advancing the forefront of intelligence. It’s not that “I’m scared of AI”; it’s that “I’m scared of what people will do with AI” when they acquire that power without putting in the effort to truly understand it.

The wave of AI images from DALL-E and Midjourney flooding my social media doesn’t identify itself as artificially produced. The burden of identifying these images has been left to unwitting viewers, and it will only become more difficult over time. While this makes for entertaining stories on Last Week Tonight, there’s a big difference between using AI to make funny pictures to share with your friends and using it to develop sophisticated social engineering methods to separate users from their passwords.

The reality of our time is that many AI offerings are being falsely advertised as solutions to intractable problems. No image generator could possibly produce pictures of John Oliver marrying a cabbage without first being trained on a set of labeled images including the likeness of John Oliver. Any AI image generator trained solely on ethically produced data sets, like the one from Getty, will inherently lack the capacity to reproduce the likeness of pop-culture celebrities. Either the generative AI will fail to produce the likeness of John Oliver, or it was trained on a data set including his likeness without prior permission. You can’t have it both ways.

In much the same vein, it would be impossible to ask ChatGPT to produce writing “in the style of Ryan Ruff” without it first being trained on a data set that includes extensive samples of my writing. Obviously, such samples exist because you’re reading one right now. However, the act of you reading it doesn’t change my rights as the “copyright holder”. The “Creative Commons” licenses I typically release my work under (either CC-BY or CC-BY-NC-SA depending on context) require that derivative works provide “attribution”.  Either AI will fail to reproduce my writing style or it illegally scraped my work without adhering to the preconditions.  In the event my work is stolen, I’m in no financial position to take a megacorp to court for compensation.

In discussions about AI, we often neglect the fact that deception is an inherent component of how these systems are constructed. How effective we measure AI to be is directly linked to how effectively it deceives us. As poor an intelligence measure as the Turing Test is, it’s still the best metric we have to evaluate these programs. When the measure of “intelligence quotient” (IQ) in humans is a well-established “pseudoscientific swindle”, how could we possibly measure intelligence in machines? If you want a computer program that separates true statements from false ones, you don’t want an “artificial intelligence” but rather an “automated theorem prover” like Lean 4. The math doesn’t lie.
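
The difference is that a theorem prover only accepts a statement when it can check an actual proof. A couple of trivial Lean 4 examples, using nothing beyond the core library:

```lean
-- Accepted only because a proof term type-checks; no statistical guessing.
example : 2 + 2 = 4 := rfl

example (a b : Nat) : a + b = b + a := Nat.add_comm a b
```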

I think one of the big lessons here is that the Turing Test wasn’t originally designed with machines in mind.  I still remember discovering this when I looked up the original paper in Doheny Library. The “imitation game” as originally described by Turing was primarily about gender.  It wasn’t about a machine pretending to be a human, but rather a man pretending to be a woman.  

Personally, my present hypothesis is that Turing was actively trying to appeal to “the straights” with how he described the imitation game. My incident with my AIM chatbot had taught me that there were large differences between how I interacted with “boys” and “girls”. Conversations with “boys” were shallow and unfeeling, easily replicated by my script. Conversations with “girls”, however, were more about getting to know the other person’s perspective to determine whether we were a potentially compatible couple. Casual conversation and romantic courtship require two entirely different strategies. Maybe the Turing Test was less about determining if machines could think and more about determining if machines could love.

Every now and then I feel overwhelmed by the flood of interaction constantly produced by social media. Sometimes I wonder if my presence could be automated so that I never miss a “Happy Birthday!” or “Congratulations!” message ever again, but then I think back to this story and remember that’s not really what I want. I don’t care about “likes”. I care about building “friendships”, and there’s no possible way a bot can do that on my behalf.

Maybe I could be a better friend if I collected data on my social interactions. At the same time, I don’t need any sophisticated statistics to tell me that I’m kind of an asshole. I’d like to think that the people I call my friends are the same people who would call bullshit on me when necessary, so I trust them to do so. This is the difference between real human relationships and fleeting conversations with a chatbot.

There’s nothing inherently wrong with the class of algorithms that have fallen under the “AI” umbrella, but how we use these tools matters. Presently these bots are being marketed as a substitute for human labor, but the reality of our legal system dictates that there still needs to be a human accountable for their actions. The only viable path to survival for AI-focused companies is to become “too big to fail” before they get caught using pirated data.

I’m not going to sit here and pretend that AI won’t have its uses. Maybe AI will come up with solutions to important problems that humans would never have thought up. If every other technique for solving a problem is unreliable, there’s less harm to be caused by attempting to solve that problem through massive amounts of data. It makes sense that such statistical tools might come in handy in fields like marketing or cybersecurity where the rules of the system are ambiguously defined.  

What is clear to me is that there exist problems for which AI will never be a viable solution. GitHub’s Copilot won’t ever magically solve Turing’s Halting Problem. It’s called “undecidable” for a reason. Using ChatGPT won’t make me a better writer, nor will using DALL-E make me a better artist. There is no substitute for the process of turning my thoughts into a concrete form, and the only way to get better at those skills is to engage in them myself. Learning the internals of how AI works may have helped make me a better mathematician, but I wouldn’t expect it to solve P = NP anytime soon. 

Given my background in teaching, I was recently asked what I thought about applications of AI in education, and I think it’s incredibly important that we approach its integration with an abundance of caution. This is something I’ve written about before, but I think it merits repeating that “AI needs to build a relationship of trust with all of the stakeholders in the environment”. In our society, we depend on teachers to be “mandated reporters” of child abuse, and I don’t think AI can responsibly fill this role. Without the lived experiences of a human being, how could such an AI possibly know which symptoms are out of the ordinary? What if it’s wrong?

Our very notion of “intelligence” is arguably shaped by the common human experience of schooling. In my time teaching, I learned that the profession depended as much on “empathy” as it did on “knowledge”. Most of the people I’ve met who “hate math” developed this mindset in response to abusive teaching practices. In order for AI to ever succeed at replicating good teaching, it would need to learn “how to love” in addition to “how to think”, and I don’t think it ever can.

Even my use of the word “learn” here seems inappropriate. AI doesn’t technically “learn” anything in the conventional sense; it just “statistically minimizes an objective cost function”. Seeing as “love” doesn’t have an objectively quantifiable cost function, it is therefore impossible to replicate using existing machine learning methods. Even if a machine were capable of expressing love, the act of replacing a human being’s job with such an AI would go against the very definition of what it means to love.
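
For the curious, “statistically minimizing an objective cost function” is mechanically unglamorous. The toy sketch below nudges a single parameter toward whatever value shrinks an arbitrary quadratic cost; the point is that the whole procedure only exists when the objective can be written down as a number.

```python
# A bare-bones illustration of "learning" as cost minimization:
# gradient descent on an arbitrary quadratic cost with minimum at w = 3.
def cost(w: float) -> float:
    return (w - 3.0) ** 2

def gradient(w: float) -> float:
    return 2.0 * (w - 3.0)  # derivative of the cost

w = 0.0
learning_rate = 0.1
for _ in range(100):
    w -= learning_rate * gradient(w)

print(round(w, 4))  # converges toward 3.0
```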

As with any new technology, AI can be used to make our lives better or worse depending on how it’s used. Let’s not lose sight of “why” we constructed these systems in the first place: to improve the quality of human life. Building these programs comes with a very real energy cost in a world where humans are already consuming natural resources at an unsustainable rate. If the expected pay-off of these AI systems is less than their expected environmental cost, then the only responsible course of action is to not build them. Anyone who fails to mention these costs when it comes to their AI product should be treated as nothing short of a charlatan.

I can’t shake the feeling that we’re in the midst of an “AI hype bubble” and it’s only a matter of time before it bursts. I can’t tell you not to use AI, especially if your job depends on it, but as your friend, I feel it’s important for me to speak up about the risks associated with it. 

True friends know when to call bullshit.
