Why you don’t see me hyping AI

There’s no ignoring the recent wave of advancement in artificial intelligence (AI). From the near photo-realistic outputs of Stable Diffusion to the pseudo-coherent text produced by ChatGPT, generative machine learning algorithms have taken the world by storm. As someone with a background in mathematics and statistics, I find these advancements fascinating from a technical perspective. At the same time, I have numerous concerns about these algorithms from an ethical perspective, and I don’t think I’m alone in holding them. That cheesy line from Spider-Man about power and responsibility rings truer now than ever. To assume that we can reap the rewards of AI without accepting any risk would be a fallacy of apocalyptic proportions.

If you’re looking for a review of the literature on the hidden risks of AI, you’d be better served by people more intelligent than me, preferably in a peer-reviewed journal. There is very little I can say about ChatGPT that hasn’t already been better articulated in Stochastic Parrots. Instead, what I’d like to offer you today is a story of a stupid teenager and his chatbot. I don’t expect it to alter the course of history or anything, but maybe it’ll provide some insight into why you should care how these technologies are being used.

I started programming in my early teens, and AI was always something of a holy grail. I was especially partial to the Turing Test as an indicator of consciousness. In this test, a computer program is tasked with deceiving a human into falsely believing that they are engaged in a text-only conversation with another human. Turing argued that if human experimenters couldn’t distinguish between programs and people, then we’d have to consider the hypothesis that machines could effectively “think”. There are arguments to be made about whether or not the Turing Test measures what it intended to, but advances in large language models have made it clear that meeting this standard is now just a matter of time. In fact, I’d argue that the Turing Test was “effectively passed” back in the mid-1960s by a program called ELIZA, developed by Joseph Weizenbaum.

ELIZA was designed to be a sort of “virtual therapist”. A human user could talk to the computer about things that were on their mind, and ELIZA would turn their statements around to form new questions. For example, if you told ELIZA “I had a rough day at work”, it might acknowledge that you’re feeling upset and inquire about it: “I’m sorry to hear that. What about your day made it rough?”. ELIZA didn’t actually know very much about the world, but it could engage a human in a fluid and convincing conversation that led the user towards self-reflection. Some users walked away from ELIZA feeling like they had been in dialogue with a real therapist. A recent preprint from UCSD researchers indicates that ELIZA’s performance on the Turing Test falls between that of GPT-3.5 and GPT-4. Not too bad for a program from the 60s.

Of course, any deep interaction would reveal the lack of “intelligence” on the other side. ELIZA couldn’t answer questions about the world. All it could do was classify different sentences into schemas and then transform them into canned responses using key tokens from the input text. Simple rules like “I feel <sad>” would get matched and transformed into “Why do you feel <sad>?”, which gave ELIZA the illusion of being a good active listener. This might sound kind of sad, but ELIZA was probably better at active listening than I am, and I knew it all too well.
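To make the mechanism concrete, here’s a toy sketch in Python of that kind of rule. This is not Weizenbaum’s actual implementation (which ranked keywords, used decomposition rules, and reflected pronouns); it’s just the minimal match-and-transform idea:

```python
import re

# Toy ELIZA-style rules: a pattern to match, and a template that reuses
# key tokens from the input to build a canned response.
RULES = [
    (re.compile(r"\bI feel (.+)", re.IGNORECASE), "Why do you feel {0}?"),
    (re.compile(r"\bI had (.+)", re.IGNORECASE),
     "I'm sorry to hear that. What about {0} made it that way?"),
]

def respond(text):
    """Return a canned response built from key tokens in the input."""
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            return template.format(match.group(1).rstrip(".!?"))
    return "Tell me more."  # fallback when no rule matches

print(respond("I feel sad"))                 # Why do you feel sad?
print(respond("I had a rough day at work"))  # asks about the rough day
```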

As a teenager in the late 90s, we didn’t have the pervasive social media outlets we have today. There was no Facebook or Twitter for you to doom-scroll. Maybe you had a public-facing GeoCities or MySpace page, but that was only if you were a nerd like myself. The de facto standard for internet communication was AOL Instant Messenger (AIM). Even if you weren’t subscribed to AOL’s internet access, you probably still used the stand-alone AIM application for direct messages, because it was literally the only service with enough members to be useful. You can’t have real-time communication without a protocol shared by enough people.

The application wasn’t even that great by today’s standards. It was full of what would now be considered negligent security vulnerabilities. In early versions, you could easily kick someone offline by sending them an instant message with some malformed HTML. If someone saved their password for easy login, it was literally stored in a plain-text file that could be looked up by anyone who knew where to check. It was the wild west era of the internet.

Around the same time, I discovered a project called ALICE. Richard Wallace had taken ELIZA’s token-handling foundation and generalized it into an Artificial Intelligence Markup Language (AIML). This separated the “code” and “persona” of the chatbot into two separate data sources. The HTML-like syntax made it easy to customize the bot’s responses into whatever you wanted. The application would read these source templates in and use them to craft a response to any message you gave it.
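For a rough sense of what that looked like, here’s a minimal AIML category, reconstructed from memory of the 1.x format rather than quoted from the spec. The pattern matches the user’s input, and the template builds the reply, with <star/> standing in for whatever matched the wildcard:

```xml
<category>
  <pattern>I LIKE *</pattern>
  <template>What do you like most about <star/>?</template>
</category>
```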

While I was reading article after article online trying to figure out how this stuff worked, I kept getting instant messages from people. These messages were not malicious in any way, but receiving them had a tendency to “pull me out” of whatever I was doing at the time. Sometimes they pulled me into fun stuff, but more often than not they were just an annoyance. As a teenage boy in the 90s, the vast majority of these interactions went down like this:

sup?

WHASSSUP?

not much, u?

just chillin’

cool deal. me too.

anyways.. I got some homework to do. I’ll catch ya later!

aight. peace out!

That’s when I got the brilliant idea to fake my own AIM account.

While exploring the flaws in the AIM application, I discovered I could hijack the message queue that distributed incoming messages to the appropriate chat window. This allowed me to parse the incoming message text and send fake keystrokes back to that window to produce a response. All I really needed to do was invoke the chatbot as an external process to generate something to say.
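In modern terms, the core loop amounted to something like this loose Python sketch. It’s not the original code (that was Win32 message-queue hackery), and both the alicebot command line and the send_keystrokes callback are hypothetical stand-ins:

```python
import subprocess

def handle_incoming(message, send_keystrokes):
    """Feed one incoming AIM message to the chatbot running as an
    external process, then 'type' its reply back into the chat window."""
    result = subprocess.run(
        ["alicebot", "--respond", message],  # hypothetical bot CLI
        capture_output=True,
        text=True,
    )
    send_keystrokes(result.stdout.strip())
```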

I took an open source implementation of ALICE and started modifying the AIML code. I removed any instance where the bot acknowledged itself as an artificial intelligence and instead taught it to lie that it was me. I added custom responses for the greeting I customarily used and gave it some basic knowledge about my areas of interest. The only difficult part was getting the bot to handle multiple conversations at the same time, which I managed by running a separate instance of the bot for each person who messaged me.

I think I let the bot run for most of the week without anyone really noticing, until one day when I got an instant message from a girl. Not just any girl either, but one I totally had a crush on at the time. My heart sank as I read her messages interspersed between the AI drivel.

Ryan, are you okay?

This isn’t like you.

No really, I’m worried about you.

If there’s something wrong, we can talk about it.

Please tell me what’s going on. I’m really concerned about you.

I felt sick. I immediately killed the process running the bot and seized control of AIM again. I don’t even remember what kind of bullshit excuse I made before abruptly going offline, but I’m pretty sure I didn’t have the courage to own up to the truth.  I had been “seen” through the eyes of another person in a way I hadn’t experienced before, and the worst part about it was that I didn’t like what I saw. I saw a person who lied to their closest friends in the name of science. 

I know this won’t make up for what I did, but I’m sorry.

I’ve since learned a lot about research ethics in psychological studies. Sometimes deception is a necessary component of studying the phenomena you’re interested in, but that alone is not sufficient reason to forgo obtaining “informed consent” from the people involved in the study.

I think this is the reason why I’m frustrated with the current zeitgeist around AI. It seems like we’re rapidly falling into the trap outlined by Ian Malcolm in Jurassic Park: some people are so preoccupied with what they could do with AI that they don’t stop to think about whether or not they should. As a result, we’ve all become unknowing participants in an unethical research study. While this behavior might be excusable coming from a punk teen who doesn’t know any better, it should be considered completely unacceptable coming from multi-billion-dollar companies claiming to be advancing the forefront of intelligence. It’s not that “I’m scared of AI”; it’s that “I’m scared of what people will do with AI” when they acquire that power without putting in the effort to truly understand it.

The AI images from DALL-E and Midjourney flooding my social media don’t identify themselves as artificially produced. The burden of identifying them has been left to unwitting viewers, and it will only become more difficult over time. While this makes for entertaining stories on Last Week Tonight, there’s a big difference between using AI to make funny pictures to share with your friends and using it to develop sophisticated social engineering methods to separate users from their passwords.

The reality of our time is that many AI offerings are being falsely advertised as solutions to intractable problems. No image generator could possibly produce pictures of John Oliver marrying a cabbage without first being trained on a set of labeled images including the likeness of John Oliver. Any AI image generator trained solely on ethically produced data sets, like the one from Getty, will inherently lack the capacity to reproduce the likeness of pop-culture celebrities. Either the generative AI will fail to produce the likeness of John Oliver, or it was trained on a data set including his likeness without prior permission. You can’t have it both ways.

In much the same vein, it would be impossible to ask ChatGPT to produce writing “in the style of Ryan Ruff” without it first being trained on a data set that includes extensive samples of my writing. Obviously, such samples exist because you’re reading one right now. However, the act of you reading it doesn’t change my rights as the “copyright holder”. The “Creative Commons” licenses I typically release my work under (either CC-BY or CC-BY-NC-SA depending on context) require that derivative works provide “attribution”.  Either AI will fail to reproduce my writing style or it illegally scraped my work without adhering to the preconditions.  In the event my work is stolen, I’m in no financial position to take a megacorp to court for compensation.

In discussions about AI, we often neglect the fact that deception is an inherent component of how these systems are constructed. How effective we judge an AI to be is directly linked to how effectively it deceives us. As poor an intelligence measure as the Turing Test is, it’s still the best metric we have for evaluating these programs. When the measure of “intelligence quotient” (IQ) in humans is a well-established “pseudoscientific swindle”, how could we possibly measure intelligence in machines? If you want a computer program that separates true statements from false ones, you don’t want an “artificial intelligence” but rather an “automated theorem prover” like Lean 4. The math doesn’t lie.

I think one of the big lessons here is that the Turing Test wasn’t originally designed with machines in mind.  I still remember discovering this when I looked up the original paper in Doheny Library. The “imitation game” as originally described by Turing was primarily about gender.  It wasn’t about a machine pretending to be a human, but rather a man pretending to be a woman.  

Personally, my present hypothesis is that Turing was actively trying to appeal to “the straights” with how he described the imitation game. My incident with my AIM chatbot had taught me that there were large differences between how I interacted with “boys” and “girls”. Conversations with “boys” were shallow and unfeeling, easily replicated by my script. Conversations with “girls”, however, were more about getting to know the other person’s perspective to determine if we were a potentially compatible couple. Casual conversation and romantic courtship require two entirely different strategies. Maybe the Turing Test was less about determining if machines could think and more about determining if machines could love.

Every now and then I feel overwhelmed by the flood of interaction perpetually produced by social media. Sometimes I wonder if my presence could be automated so that I never miss a “Happy Birthday!” or “Congratulations!” message ever again, but then I remember this story and remember that’s not really what I want. I don’t care about “likes”. I care about building “friendships”, and there’s no possible way a bot can do that on my behalf.

Maybe I could be a better friend if I collected data on my social interactions.  At the same time, I don’t need any sophisticated statistics to tell me that I’m kind of an asshole. I’d like to think that the people I call my friends are the same people that would call bullshit on me when necessary, so I trust in them to do so. This is the difference between real human relationships and fleeting conversations with a chatbot. 

There’s nothing inherently wrong with the class of algorithms that has fallen under the “AI” umbrella, but how we use these tools matters. Presently these bots are being marketed as a substitute for human labor, but the reality of our legal system dictates that there still needs to be a human accountable for their actions. The only viable path to survival for AI-focused companies is to become “too big to fail” before they get caught using pirated data.

I’m not going to sit here and pretend that AI won’t have its uses. Maybe AI will come up with solutions to important problems that humans would never have thought up. If every other technique for solving a problem is unreliable, there’s less harm to be caused by attempting to solve that problem through massive amounts of data. It makes sense that such statistical tools might come in handy in fields like marketing or cybersecurity where the rules of the system are ambiguously defined.  

What is clear to me is that there exist problems for which AI will never be a viable solution. GitHub’s Copilot won’t ever magically solve Turing’s halting problem; it’s called “undecidable” for a reason. Using ChatGPT won’t make me a better writer, nor will using DALL-E make me a better artist. There is no substitute for the process of turning my thoughts into a concrete form, and the only way to get better at those skills is to engage in them myself. Learning the internals of how AI works may have helped make me a better mathematician, but I wouldn’t expect it to resolve P vs NP anytime soon.

Given my background in teaching, I was recently asked what I thought about applications of AI in education. I think it’s incredibly important that we take an abundance of precaution with its integration. This is something I’ve written about before, but it merits repeating that “AI needs to build a relationship of trust with all of the stakeholders in the environment”. In our society, we depend on teachers to be “mandated reporters” for child abuse, and I don’t think AI can responsibly fill this role. Without the lived experiences of a human being, how could such an AI possibly know what symptoms are out of the ordinary? What if it’s wrong?

Our very notion of “intelligence” is arguably shaped by the common human experience of schooling.  In my time teaching, I learned that the profession depended as much on “empathy” as it did “knowledge”.  Most of the people I’ve met who “hate math” developed this mindset in response to abusive teaching practices.  In order for AI to ever succeed in replicating good teaching, it needs to learn “how to love” in addition to “how to think” and I don’t think it ever can.  

Even my use of the word “learn” here seems inappropriate. AI doesn’t technically “learn” anything in a conventional sense; it just “statistically minimizes an objective cost function”. Seeing as “love” doesn’t have an objectively quantifiable cost function, it’s therefore impossible to replicate using the existing methods of machine learning. Even if a machine were capable of expressing love, the act of replacing a human being’s job with such an AI would go against the very definition of what it means to love.

As with any new technology, AI can make our lives better or worse depending on how it’s used. Let’s not lose sight of “why” we constructed these systems in the first place: to improve the quality of human life. Building these programs comes with a very real energy cost in a world where humans are already consuming natural resources at an unsustainable rate. If the expected pay-off of an AI system is less than its expected environmental costs, then the only responsible course of action is not to build it. Anyone who fails to mention these costs when it comes to their AI product should be treated as nothing short of a charlatan.

I can’t shake the feeling that we’re in the midst of an “AI hype bubble” and it’s only a matter of time before it bursts. I can’t tell you not to use AI, especially if your job depends on it, but as your friend, I feel it’s important for me to speak up about the risks associated with it. 

True friends know when to call bullshit.

Teaching Statement

I didn’t become a teacher with the intention of doing it forever.  My original goal was to design educational video games, but I felt it would be presumptuous of me to build such technology without ever having set foot in a classroom.  Becoming a math teacher seemed like the fastest way for me to find out what kinds of tools schools actually needed. Now I’m not even sure I’m the same person.  

I made countless mistakes during the twelve years I spent teaching, but the one thing I think I got right was approaching it with a “here-to-learn” attitude.  Learning can only take place with the learner’s consent.  Opening oneself up to learning a new skill means allowing oneself to be vulnerable to mistakes.  Teaching is about creating an environment where multiple learners feel comfortable with the risks of engaging in that process together. The first step is to establish a relationship of trust.

In all honesty, relationship building has never been one of my strengths so I had to make an active effort to improve on it as a teacher.  I found that the most powerful method for facilitating a student’s learning is to simply ask what they need and listen to what they say.  Really listen and trust them.  It’s incredibly difficult to learn when you’re tired, hungry, or stressed. Sometimes “taking a break” is a necessary stage in the learning process. Treating people with kindness is a prerequisite for any meaningful learning to take place.   

One of the most difficult challenges for me as a teacher was learning how to navigate spaces of trauma.  For me, mathematics is something that evokes feelings of joy but my experiences are both highly abnormal and shaped by privilege.  More often a student’s experiences with mathematics are shaped by structural forms of oppression including racism, sexism, and ableism.  Learning how to openly reflect on how I was complicit in these systems was a key factor in my growth as a teacher.  I believe students should be able to see themselves in mathematics, so I tried to actively seek out and integrate the stories of mathematicians from diverse backgrounds into the curriculum.  The self-work continues to be an ongoing process.

My goal as a teacher was to construct an environment where my students could freely “play” with mathematics.  I feel learners are entitled to the opportunity of exploring mathematics and discovering new knowledge on their own.  Often the play comes with a set of constraints that help direct it towards a specific objective, but the important qualities are that the task has a low skill floor and high skill ceiling.  There should be both an easy way for everyone to engage and the depth to encourage further exploration.  Too often we fall into a trap of erroneously thinking there’s “one right answer” in mathematics, so I make it a point to include questions with “no wrong answer”.  I found this helped to foster a culture of collaboration in the classroom because everyone’s input is of equal value in the discussion.

Exploration has limited effectiveness when you’re obligated to address very specific learning objectives, so I usually follow up with some form of direct instruction to fill in the gaps. It’s not quite as engaging, but sometimes students need a concrete example of the behavior they’re expected to model. Our brains are very efficient at mirroring actions.  I’ve found that “worked examples” can also provide a valuable resource afterwards when the student is attempting to replicate the process on their own.  As the number of examples grows, the metacognitive process of learning how to organize this information can reveal insights into its structure.

The next phase of the learning process is to engage in a cycle of formative assessment and feedback known as “practice”.  Any new skill must be practiced to be maintained.  This is one area where I think educational technology excels, because it can enable nearly instantaneous feedback to learners.  While my students often enjoyed the “gamification” of practice, it’s important to select such products carefully.  I’ve learned it’s important for developers to remember that “accuracy is more important than speed” and “some skills cannot be assessed through multiple-choice”.  As our technology improves, so will our automated feedback.  I’m particularly excited about the potential applications of “Large Language Models” in this area, but the application of Artificial Intelligence will also require a great deal of testing before it meets the ethical criteria necessary for use in the classroom.

In the reality of schooling, there’s likely to be a summative assessment stage in the learning process as well, but I tend to think this distinction is artificial.  As far as my class policies were concerned, all assessments are treated as formative where possible.  I tried to allow my students the opportunity to retake assessments as often as needed to the extent I was able. This is another aspect of teaching I found heavily supported by technology.  The combination of algorithmic question generation with automated feedback made it possible for me to focus on the broader picture provided by the data over time.  

If anything, I tend to look at summative assessment data as a tool for self-reflection.  As a student, summative assessments provide me with a form of external validation that I have in fact learned what I set out to learn.  As a teacher, the relation between assessment data and my own performance was always a little bit fuzzy but the process of looking back at that data and asking questions about what I could do differently was an essential part of my personal self-improvement.  I think it’s important to not put too much stock in any one assessment and instead use multiple data sources like observations and interviews to help triangulate areas for growth.  

The final stage in the learning process is to teach what you have learned to someone else.  I think we sometimes overlook this stage because it starts a new cycle of learning, but there are subtle differences between having a skill and being able to teach that skill to others.  Through attempting to teach math, I often found myself seeing old concepts in a new light.  My knowledge of geometry and data analysis grew deeper each time a student asked me “why?”.   Sometimes the most powerful phrase in the classroom is “I don’t know. How can we find out?”.  Likewise, I’m thankful that I had co-workers that were more knowledgeable about teaching than myself and capable of sharing that expertise.  I’m hopeful that I’ll be able to use what I’ve learned about teaching to help others as well.

I’m not necessarily looking for another “teaching job” but the act of teaching has become inseparable from how I learn.   Even if no one reads what I write, the act of putting my thoughts into words has power in it.  No matter where I go or what I do, I will learn new things and attempt to teach them to others.  We face a critical moment in society where we need to recognize the true value of the skills that teachers can bring to an organization. Every organization must learn to grow and teachers are experts on learning how to learn.

identity politics

When I joined Twitter back in 2009, I settled in on the following profile for myself: “I’m a video game developer turned educator, and hope to combine the two to make learning math FUN for people of all ages.”

At the time this seemed an appropriate description of myself, but I’ve been teaching for over a decade now and it seemed like time to update it. Many of the notions I had about teaching coming into the profession have since changed, as has my very sense of self-identity. I wanted a new Twitter bio that would capture as much about my present-day self as I could possibly fit in the 160-character limit. Here’s what I’ve settled on (for now):

“Professionally: mathematician, psychologist, programmer, & educator. Personally: atheist, socialist, musician, & gamer. Expect uncensored combo of both. He/Him.”

I think this is the first time I’ve publicly called myself a “mathematician”. I’ve always been too nervous to do so when I only have a B.S. and not a Ph.D., but years of teaching have changed how I think about this label. I want my students to feel comfortable calling themselves mathematicians. How could I possibly hope to communicate this when I was hesitant to do so myself?

I came to the conclusion that having a degree doesn’t make one a mathematician. Having mathematical thinking permeate every aspect of one’s being makes one a mathematician. Under that definition, this most certainly is a label I would use to describe who I am. If unilaterally calling myself a mathematician helps remove this stigma of not having a Ph.D. for others, then it’s about time I start doing so.

I started thinking about what other terms I wanted to normalize using and the rest of the bio just sort of fell into place. It’s kind of a liberating feeling to simultaneously challenge certain labels by expressing myself. If this is what it means to play “identity politics”, then not only am I going to play — but I’m going to play to win.

This one goes out to all the future mathematicians.

Logarithms and Ethnocentrism

(Note: I also created an interactive version of this post in the Desmos Activity Builder. Try it here.)

Hey y’all!

I want to tell you a story today about racism and mathematics.

Well, to be more specific, I’m going to make the argument that trends in mathematical notation can have culturally biased consequences that confer systematic advantages to white people. I’m going to make this argument through a sequence of math problems, so I hope you’ll follow along and attempt them.

I don’t know if I’ll be able to “prove” this argument to you in the course of this activity, but I hope it at least instills a sense of “doubt” in the idea that math is objectively neutral.

Without further ado, let’s get started.

Place a mark on the number line where you think “1 Thousand” should go.

Got it?

Okay, I’m going to make a guess as to where you placed it.

Ready?

Did you place it at point P below?

Sorry, I hate to be the one to break it to you, but according to the ‘math powers that be’, this is technically incorrect. However, I want you to hold onto this idea because I’d argue that you’re not as wrong about this as they say you are.

Here’s where you “should” have placed it.

When mathematicians present you with a graph, they implicitly assume that the graph is on a “linear scale”. That means that each unit is equally spaced along the line.

One billion divided by one thousand is one million, which means that “1 thousand” should be placed “1 millionth” of the way between 0 and 1 billion.

At the scale of this graph, this number is so close to the 0 that they’re visually indistinct.

So what about that other point?

That’s actually placed at the value 1/3*10^9 or “one third of a billion”.

However, I don’t want you to think you’re wrong if you placed it here. In fact, this is the correct placement of 1000 on another type of scale.

Check out the following scale instead:

So what would you call a scale like this?

Take a second and write it down.

Go ahead. I’ll wait.

According to the “mathematical community”, this is called a “logarithmic scale”.

Under a logarithmic scale, each successive “unit” gets multiplied by a common factor, such as 10, rather than added. This results in a scale like the one in the diagram, except the “0” is actually 10^0 = 1.
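Spelling out the placement arithmetic for both scales (a quick check, using an axis that runs from the origin up to 10^9):

```latex
% Where does 1000 land on an axis that runs up to one billion?
%   linear scale:      pos(x) = x / 10^9
%   logarithmic scale: pos(x) = log10(x) / log10(10^9) = log10(x) / 9
\[
\mathrm{pos_{linear}}(1000) = \frac{1000}{10^{9}} = 10^{-6}
\qquad
\mathrm{pos_{log}}(1000) = \frac{\log_{10} 1000}{\log_{10} 10^{9}} = \frac{3}{9} = \frac{1}{3}
\]
```

And one third of the way along a linear axis that runs to a billion is the value 1/3 * 10^9, which is exactly where that point P sits.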

There’s even research suggesting that humans may actually be predisposed to think of numbers this way, and that the linear scale is a learned behavior. We see it in Indigenous cultures, in children, and even in other species. Thinking “linearly” is a social norm distributed through cultural indoctrination. It’s like mathematical colonialism.

Here’s my problem with that diagram. If I hadn’t been taught to call this a “logarithmic scale”, I would have called it a “power scale” or an “exponential scale”. Doesn’t that name intuitively make more sense when you see this sequence of labels?

Before we go any further, let’s talk about mathematical notation.

What do you think makes for “good mathematical notation”?

What sort of patterns do you see in the ways that mathematicians name things the way they do?

Here’s what I perceive as the primary patterns in mathematical notation.

Mathematicians sometimes name things “descriptively”: the name tells you what the thing is. For example, the “triangle sum theorem” tells you implicitly that you’re going to add up three angles.

Mathematicians sometimes name things “attributively”: the name tells you who came up with the idea. For example, “Boolean Algebra” is named after “George Boole”.

Mathematicians sometimes name things “analogously”: using symbols with visual similarities to convey structural similarities. For example, “∧ is to ∨ as ∩ is to ∪”.

Mathematicians sometimes name things completely “arbitrarily” for historical reasons. Don’t even get me started on the “Pythagorean Theorem”.

Where would you classify “logarithm” in this scheme?

Personally, I think it depends on who you ask.

If you know a little Greek (which is a cultural bias in itself), you might argue that this label is “descriptive”. The word “logarithm” basically translates to “ratio-number”. The numbers in this sequence are arguably in “ratios” of 10, but does that actually convey enough information for you to know what logarithms really are?

I’d argue that this label is in fact “arbitrary” and actually refers to “how logarithms were used” rather than “what logarithms are”.

So what is a logarithm?

Generally speaking, we define a “logarithm” as the “inverse power function” or “inverse exponential function”.

Just as subtraction undoes addition, and division undoes multiplication, a logarithm undoes a power.

For example, 10^4 = 10000, so log10(10000) = 4.

Though it’s a little clumsy, maybe this will make way more sense if we just use the following notation:

log10(10000) = power?10(10000) = 4

The natural inverse of “raising something to a power” would be “lowering” it, right? The mathematical statement log10(10000) says this:

“What power of 10 will give you the number 10000?”

The answer to that question is 4. This is the essence of a logarithm.

So why don’t we just call logarithms “inverse exponents”?

Well, we call them “logarithms” because this was the term popularized by a guy named John Napier in the early 1600s.

Normally I’d be okay with the guy who created something getting to name it. However, I don’t think that honor should necessarily go to Napier. This idea of “inverse powers” had shown up in the early 800s thanks to an Indian mathematician named Virasena, and the way Napier employed logarithms was similar to an ancient Babylonian method devised even earlier than that.

What Napier did, in my opinion, was convince other white folx of the power inherent in logarithms.

Please, allow me to demonstrate with some “simple” arithmetic. Try to do these two problems without a calculator.

Problem A: 25.2 * 32.7

Problem B: 1.401+1.515

Go ahead, take as much time as you need.

Which problem was harder?

Problem A, right?

Here’s the mathematical brilliance of the logarithm. Because log(a*b) = log(a) + log(b), this “inverse power function” can be used to turn a very difficult multiplication problem into a much simpler addition problem. Problem B can be used to provide a very reasonable approximation to Problem A in a fraction of the time if you look at it through the lens of logarithms. They didn’t have computers back then, so they used tables of precalculated logarithms to drastically speed up the computation of large products.

Here’s how it works.

Start by looking up the logarithms of the numbers you want to multiply:

log(25.2) ≈ 1.401

log(32.7) ≈ 1.515

Once you’ve taken the logs, add them together.

1.401+1.515 = 2.916

Finally, do a reverse look up to find the number that would produce this logarithm.

log( ? ) = 2.916 = log( 824.1 )

This reverse look-up is really the power function: 10^2.916 ≈ 824.1.

Pretty neat, right?

The result 824.1 is pretty darn close to the actual value of 824.04. It’s not perfect because we rounded, but it’s reasonable enough for many applications.
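If you’d like to check this for yourself, a few lines of Python reproduce the whole procedure, with math.log10 standing in for the printed tables:

```python
import math

a, b = 25.2, 32.7
# "Look up" the logs, rounded to three places like a printed table.
log_a = round(math.log10(a), 3)  # 1.401
log_b = round(math.log10(b), 3)  # 1.515
log_sum = log_a + log_b          # 2.916
# The "reverse look-up" is just the power function.
approx = 10 ** log_sum
print(approx, a * b)  # ~824.1 versus the exact 824.04
```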

This idea of using logarithms to speed up calculation resulted in the invention of the “slide rule”, a device which revolutionized the world of mathematics. Well, more accurately, it revolutionized European mathematics at approximately the same time that the British Empire just happened to start colonizing the globe.

Let’s spell that timeline out a little more explicitly:

~800 CE: Indian mathematician Virasena works on this idea of “inverse powers”.

~1600 CE: John Napier rebrands this idea as a ‘logarithm’.

~1620 CE: Invention of the “slide rule”, which uses logarithms to speed up calculations and helps turn Europe into an economic powerhouse.

~1800 CE: The British Empire begins colonizing India.

Do you think this is a coincidence?

I don’t.

I think European mathematicians were quite aware of the power this tool provided them. Naming this tool a “logarithm” was a way of intentionally segregating mathematical literature to prevent other cultures from understanding what logarithms really are.

When a tool provides this much computational power, the people using that tool have a strong motivation to keep that power to themselves.

It’s like the recent linguistic shift in the usage of “literally” and “figuratively”. People have used the word “literally” to describe things “figuratively” in such large numbers that they have literally changed the meaning of the word.

Logarithms have been used to describe exponential behaviors for so long that the relationship between “powers” and “inverse powers” has become blurred by mathematical convention.

This has far reaching consequences for mathematics education where we need students to understand the implications of exponential growth. This is even more important considering the recent COVID-19 pandemic.

Consider the following example from the New York Times:

If I accidentally referred to the second graph as an “exponential scale” instead of “logarithmic scale”, would you still know what I was talking about?

What’s the point of calling it that?

I think we need to acknowledge that there exist “self-reinforcing power structures” in mathematics. These mathematical tools provide power to people, so those people fight to keep that power to themselves. This is an act of “segregation”.

Over time, these tools become unavoidably commonplace. Now we’re in a situation where mathematicians argue that these norms should be “assimilated” because their use has become so widespread.

As a result, these cultural biases have resulted in a sort of mathematical colonialism that masquerades as objectivity. I believe that this ethnocentrism systematically disenfranchises BIPOC by hiding the true history of these mathematical ideas. This, in turn, results in systematic biases in test scores between whites and BIPOC, which then reinforce the original stereotypes.

It’s a vicious cycle of racism.

Wouldn’t you agree?

It might be too late to stop the use of the word “logarithm” in mathematics. It’s now something of a necessity to understand a wealth of other mathematical advances that have been built on top of this concept. However, that doesn’t mean we should pretend that this term is completely neutral either.

I hope you’ll leave here today with a better understanding of why it’s important to look critically at our mathematical conventions and acknowledge that math is not exempt from cultural biases.

Thanks for reading!

Geometry is racist.

I have a confession I need to get off my chest. I think Geometry is racist. Well, not geometry (the discipline of mathematics), but specifically Geometry (the course commonly taught in American high schools). I realize that in presenting this argument I’m going to have to make some rather sweeping generalizations about human history, but I think it’s important for me to put this hypothesis into words because I don’t think there’s enough evidence to disprove it.

Fair warning: this may be a lengthy read.

I believe that the reason the curriculum of Geometry specifically references Pythagoras, Euclid, and Descartes is because this emphasis provided Whites another means by which to rationalize slavery. Greek society was a prototype for White society because it used mathematics and philosophy to moralize slavery, and Descartes provided Whites a method of reconciling this position with Christianity.

To understand this, we first need to acknowledge that the entire human history of mathematics is intertwined with human slavery. Both mathematics and slavery emerged in early civilizations after the transition from hunter/gatherer to agricultural society led to social stratification and the development of written language. In fact, I think mathematics is a necessary precondition for slavery. The very concept of slavery depends on a life being assigned a quantifiable value. Slave labor provided the ruling class with the time and resources to work on furthering mathematical knowledge, and advances in mathematics were in turn used to increase the efficiency of the slave trade. This vicious cycle quickly elevated the slavers to a god-king status, which they then exploited to moralize their slavery.

Our oldest mathematical texts are from Egypt and Babylon, approximately 4000 years ago. Surviving records such as Plimpton 322, the Moscow Mathematical Papyrus, and the Rhind Mathematical Papyrus show that teachers have been giving math worksheets of roughly high school level difficulty since at least 1900 BC. What’s interesting about these three texts is that all three of them show evidence of the Pythagorean Theorem — over a thousand years before Pythagoras was even born! There’s also a strange lack of evidence attributing the theorem to Pythagoras, along with indications that proofs of it may have been circulating in Mesopotamia and India several hundred years prior. The very fact that this theorem bears the name of Pythagoras despite this questionable lack of evidence should alone be sufficient evidence of Geometry’s racism, but Pythagoras is only the first of three dominoes.

When Pythagoras supposedly lived around 500 BC, the Greeks (and Romans) had lively slave trades. However, unlike the god-king theocracies of Babylon and Egypt, the Greeks fashioned themselves as a “Democracy”. The Greeks needed an alternative means of moralizing slavery, and whether he intended it or not, Pythagoras’ merger of mathematics and philosophy provided them with all the justification they needed. The Greeks used philosophy and mathematics to separate themselves from the people they enslaved, which served as a prototype for White society. Those that didn’t measure up to Greece’s intellectual standards were labeled as “barbarians” and treated as less than human. To paraphrase Plato, these barbarians were in the dark and needed to be shown the light. Incidentally, the “barbarian tribes” of Northern Europe enslaved by the Greeks and Romans were likely a source of the genetic markers for light skin, light hair, and light eyes that would later be associated with “Whiteness”.

Meanwhile, mathematics probably continued to develop in the Middle East with influence from early civilizations in India and China. However, we don’t know by exactly how much. There’s another strange lack of mathematical records from Babylonian society under the Persian Empire. What we do know is that when Alexander the Great conquered the region in 331 BC, the Greeks assimilated a wealth of astronomical data. It’s quite likely that the Greeks stole whatever mathematical material they could through military conquest, and that these materials ended up in the Library of Alexandria, where Euclid would have had access to them. Euclid’s work, in other words, was built on a body of appropriated mathematical literature that he was privileged to access. While it’s impossible to discount the probability that Euclid added some original ideas, I do think this fact merits taking a closer look at his work and evaluating whether or not it belongs in our schools’ curriculum.

My problem with the presentation of Euclid in Geometry (the course) is the parallel postulate. To be fair, presenting Euclid’s axioms as fundamental truths about the universe made sense in mathematics education up until as recently as 1905. We now know that space is not “flat” as Euclid assumed, but that gravitational forces can bend space around a mass. My understanding is that we teach Euclidean geometry because (a) it’s a useful approximation of reality and (b) it provides an introduction to axiomatic systems. The main issue with Euclid as presented in Geometry is that its axioms are presented without proof and we have solid scientific evidence that one is not always true. How can we ever expect students to think critically about information when we don’t question the assumptions made in mathematics?

Euclid’s mathematics is not irreplaceable within the mathematics curriculum. We could just as easily teach students about axiomatic systems using first-order logic. Instead of teaching compass and straightedge constructions, we could teach origami. In fact, teaching these alternatives could not only increase the diversity of the mathematics but simultaneously increase the rigor. There is some really great math showing that we can solve problems with paper folding (like trisecting an arbitrary angle) that are impossible with a compass and straightedge alone. Learning origami would help prepare students for high-level physics and computer science. We owe it to our students to do better than Euclid.

Approximately 300 years after Euclid, the Roman Empire was nearing its peak and Christ was born. The death of Christ coincides with the start of two centuries of peace. It’s no wonder that early people considered this a miracle. Early Christians were originally persecuted in the Roman Empire, but by 380 AD it would become the state religion. If you think about it, Christianity offered the Romans a pretty sweet deal considering their involvement in the religion’s origin. The Romans killed Jesus, and through Jesus’ death were forgiven all their sins. And that meant all their sins: slavery included. To suggest that otherwise would be heresy. Before long, Christianity had accumulated so much Power that they rewrote the entire calendar around their own beliefs.

The spread of Christianity throughout Europe is relevant to our discussion because the next major event in mathematical history happens to coincide closely with the birth of Islam. Around 800 AD, Muhammad ibn Musa al-Khwarizmi developed the branch of mathematics we now call “Algebra” and started a new mathematical revolution in the Middle East. This presented Christianity with a threat, because geometry had always been considered sacred and the followers of Islam were creating new mathematics for another god. The Bible says “Thou shalt have no other gods before me“. By 1100 AD, Christianity had come to the conclusion that since they couldn’t disprove the Muslims mathematically, the only other option was force. The Christians took to war against the Muslims, and with each mosque they burned they destroyed part of that mathematical progress. Remnants of this “Holy War” persist to the present day.

It’s because of this religious tension between Muslims and Christians that our story leads to Rene Descartes. Another essential component of the Geometry curriculum is Cartesian geometry, a merger of algebraic principles with Euclidean geometry. The very name would have you believe that Descartes invented it, but that honor should probably be attributed to Fermat. No, I think there’s another reason we attribute it to Descartes, and that is his work in ontology. Most people are familiar with “I think, therefore I am“, but that statement is really the first axiom in a system that Descartes constructed in order to develop a mathematical argument proving the existence of God. Descartes is one of the few mathematicians that attracts cross-curricular study because of this. We don’t even give his argument a fair critique. Like his Greek counterparts, we treat Descartes’ work as gospel when in reality he was repackaging the ideas of others. In this case, Descartes had taken the Muslim-sounding “algebra” and re-packaged it in a form that would be more acceptable to Christians.

The philosophical assertion “I think, therefore I am” takes on another meaning when we put it in the context of mid-1600s White society. By associating “thinking” with “existing”, Whites were able to deny Blacks basic human rights by labeling them as “unthinking savages”. It’s the very same strategy employed by the Greeks to justify slavery; only the vocabulary had shifted. The same idea would later manifest itself through literacy tests, and persists today through standardized testing practices. Whites retain Power by defining a culturally biased set of standards.

A few weeks ago I listened to a podcast from Freakonomics that described America’s Math Curriculum as a “Geometry Sandwich”. Ever since I taught it I’ve been asking myself the same kinds of questions:

  • Why “sandwich” Geometry in the middle of two years of Algebra?
  • Why are Euclid, Pythagoras and Descartes the only three names that students explicitly learn in Geometry?
  • Why is the name al-Khwarizmi not given the same level of treatment in Algebra?

The most probable explanation I can come up with to answer all three of these questions is that the American Math Curriculum is racist. It is probably not the intent of the curriculum developers to be racist, but racism is not about the intent — it’s about the consequences of the behavior. There is a non-zero probability that cultural biases exist within our Math Curriculum and produce racist consequences. Mathematics is just a tool, but people decide how that tool is used. We can either continue to use that tool to enable racism, or we can use it to identify and dismantle racism.

For me, I’ve decided that the probability of the Math Curriculum being racist is too high for me to ignore. I’ve decided on the following course of action:

  • I plan to start substituting “Right Triangle Theorem” in place of “Pythagorean Theorem”.
  • I plan to start substituting “Rectangular Coordinate System” in place of “Cartesian Coordinate System”.
  • I plan to pay closer attention to situations where Euclidean geometry is implicitly assumed and question whether or not this assumption is warranted.
  • I plan to make a consistent effort to present a diverse range of mathematicians in my classrooms.
  • I plan to actively look for other artifacts of racism in the curriculum and address them accordingly.

I realize that my argument here needs work and that I’m personally having difficulty disentangling race and religion. Even if I haven’t yet convinced you that “Geometry is racist”, I think there’s sufficient evidence here to conclude that it’s definitely not anti-racist. That fact alone is reason enough for me to act. If you’re also a teacher of mathematics, I hope I’ve convinced you enough to join me in these actions.

Political Calculus

Disclosure:  This article is primarily mathematical in nature, but the very act of discussing politics makes it difficult to fully remove bias.  I feel obligated to disclose that I’m a member of the Green Party.  While I’m neither a Republican nor a Democrat, I tend to lean toward the north-west section of the Nolan chart.  However, I do intend to try my best to make this analysis as neutral as humanly possible.

During my regular social media browsing the other day, I came across two posts of interest.

The first was a statement from the Green Party of Virginia about why they are not endorsing Bernie Sanders ahead of the primary.  While I had expected this to be the case, there was a section of this statement that really caught my attention: “Whether individual Greens choose to vote for Sanders on March 1st is a choice that will depend on their own calculus of what is best for the country” (emphasis mine).

Since one of the co-chairs of the GPVA is a mathematician, I could reasonably assume that the reference to calculus was intended to mean exactly what it says. The problem is that the general population doesn’t usually look at elections from this perspective.  People tend to vote based on gut feelings rather than mathematical analysis. For this reason, I disagree with the GPVA’s decision: calculus isn’t a strong point for most voters, and I feel the party has a responsibility to provide its members with information on how to maximize their influence on the election. If the GPVA refuses to take sides in the primary, then I feel obligated to do so in their place.

The second was a data visualization of how various primary candidates would fare against each other in a general election:

With “Super Tuesday” fast approaching, this was exactly the kind of information that I needed!  This effectively provides a payoff matrix for the primary candidates to which I can apply my “political calculus”.

RFC: Are geometric constructions still relevant?

Dear friends, fellow math teachers, game developers and artists.

I’ve got this little dilemma I’m wondering if you could help me with. You see, part of my geometry curriculum deals with compass and straightedge constructions. My colleagues have suggested that this is a topic we, not exactly skip, but… I dunno what the appropriate word here is… trim?  They argue that it’s largely irrelevant for our students, is overly difficult, and represents a minimal component of the SOL test. And I don’t think they’re wrong. I haven’t used a compass and straightedge since I left high school either.

However, something about these constructions strikes me as beautiful. Part of me thinks that’s enough reason to include them, but it also got me thinking about more practical applications.   Where did I use them?  I used them making video games.  Video games build worlds out of “lines” and “spheres”.  Beautiful worlds.

My question is this, do my 3D artist friends feel the same way?  Do you remember your compass and straightedge constructions?  Do you use them, or some derivation thereof, in your everyday work?  Are you glad to have learned them?  Or are the elementary constructions made so trivial in modern 3D modeling software that you don’t even think about them?

Please comment and share.

Meta-Pokemon

In a previous post, I mentioned my fascination with Twitch Plays Pokemon (TPP). The reason behind this stems from the many layers of metagaming that take place in TPP. As I discussed in my previous post, the most basic definition of metagaming is “using resources outside the game to improve the outcome within the game”. However, there’s another definition of metagaming that has grown in usage thanks to Hofstadter: “a game about a game”. This reflexive definition of metagaming is where the complexity of TPP begins to shine. Let’s take a stroll through some various types of metagaming taking place in TPP.

Outside resources

At the base level, we have players making use of a variety of outside resources to improve their performance inside the game. For Pokemon, the most useful resources might include maps, bestiaries, and Pokemon type matchups. In TPP, many players also bring with them their own previous experiences with the game.

Game about a game

Pokemon itself is a metagame. Within the world of the game, the Pokemon League is its own game within the game. A Pokemon player is playing the role of a character who is taking part in a game tournament. What makes TPP so interesting is that it adds a game outside the game. Players in TPP can cooperate or compete for control of the game character. In effect, TPP is a meta-metagame: a game about a game about a game. Players in TPP are controlling the actions of a game character participating in a game tournament. It’s Pokemon-ception!

Gaming the population

Another use of metagaming is to take knowledge of trends in player behavior and use that information to improve the outcome within the game. In TPP, players would use social media sites such as Reddit to encourage the spread of certain strategies. Knowledge of social patterns in the general population of TPP players enables a few players to guide the strategy of the collective in a desirable direction. Memes like “up over down” bring structure to an otherwise chaotic system and quickly become the dominant strategy.

Gaming the rules

One of my favorite pastimes is theory-crafting, which is itself a form of metagaming. Here, we take the rules of the game and treat the space of possible strategies as a game in its own right. The method TPP used in the final boss fight is a perfect example. The boss is programmed to select a move type that the player’s pokemon is weak against, and one of these moves deals no damage. By using a pokemon that is weak against that particular move, the boss is locked into a strategy that will never do any damage! Not only did the TPP players turn the rules of the game against it, but they also needed to organize the population to pull it off!

Rule modification games

One of the defining characteristics of a game is its rules. The rules of Pokemon are well defined by the game’s code, but the rules of TPP are malleable. We can choose between “chaos” and “democracy”. Under chaos, every player input gets sent to the game. Under democracy, players vote on the next action to send. When we look at the selection of rules as a game in which we try to maximize viewers/participants, we get another type of metagaming.
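As a sketch of what the democracy rule might look like in code (my own guess at the mechanics, not Twitch’s actual implementation; read_chat_line and press_button are assumed stand-ins for the streaming plumbing):

```python
import time

VALID_INPUTS = {"up", "down", "left", "right", "a", "b", "start", "select"}

def democracy_round(read_chat_line, press_button, window_seconds=10.0):
    """Collect chat inputs for a fixed voting window, then send only
    the winning button press to the game."""
    votes = {}
    deadline = time.monotonic() + window_seconds
    while time.monotonic() < deadline:
        line = read_chat_line()  # assumed: returns one chat message
        if line in VALID_INPUTS:
            votes[line] = votes.get(line, 0) + 1
    if votes:
        press_button(max(votes, key=votes.get))
```

Chaos mode is then the degenerate case: skip the tally entirely and pass every valid input straight through to press_button.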

Understanding Voter Regret

Lately I’ve been doing a little bit of research on voting methods.  In particular, I’ve been fascinated by the idea of measuring Bayesian Regret.  Unfortunately, many of the supporting links on rangevoting.org are dead.  With a little detective work, I managed to track down the original study and the supporting source code.

Looking at this information critically, one of my concerns was the potential for bias in the study.  This is the only study I could find taking this approach, and the information is hosted on a site dedicated to supporting the very method the study proved most effective.  This doesn’t necessarily mean the result is flawed, but it’s one of the “red flags” I look for in research.  With that in mind, I did what any skeptic should: I attempted to replicate the results.

Rather than simply use the provided source code, I started writing my own simulation from scratch.  I still have some bugs to work out before I release my code, but the experience has been very educational so far.  I think I’ve learned more about these voting methods by fixing bugs in my code than by reading the original study.  My initial results seem consistent with Warren Smith’s, but there are still some kinks to iron out.

What I’d like to do in this post is walk through a sample election that came up while I was debugging my program.  I’m hoping to accomplish a couple of things by doing so.  First, I’d like to explain in plain English what exactly the simulation is doing.  The original study seems to be written with mathematicians in mind, and I’d like these results to be accessible to a wider audience.  Second, I’d like to outline some of the problems I ran into while implementing the simulation.  It helps me to reflect on what I’ve done so far, and perhaps a reader out there can point me in the right direction on these problems.

Pizza Night at the Election House

It’s Friday night in the Election household, and that means pizza night!  This family of 5 takes a democratic approach to their pizza selection and conducts a vote on what type of pizza they should order.  They all agree that they should get to vote on the pizza.  The only problem is that they can’t quite agree on how to vote.  For the next 3 weeks, they’ve decided to try out 3 different election systems: Plurality, Instant Run-off, and Score Voting.

Week 1: Plurality Voting

The first week they use Plurality Voting.  Everyone writes down their favorite pizza, and whichever pizza gets the most votes wins.

The youngest child votes for cheese.  The middle child votes for veggie.  The oldest votes for pepperoni.  Mom votes for veggie, while dad votes for hawaiian.

With two votes, veggie pizza is declared the winner.

Mom and the middle child are quite happy with this result.  Dad and the two others aren’t too excited about it.  Because the 3 of them were split on their favorites, the vote went to an option that none of them really liked.  They feel hopeful that things will improve next week.
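
In code, a plurality count is about as simple as an election gets. Here’s a minimal Python sketch (an illustration, not my actual simulation code):

    from collections import Counter

    def plurality_winner(ballots):
        # each ballot names a single favorite; most votes wins
        return Counter(ballots).most_common(1)[0][0]

    # Week 1 in the Election house
    ballots = ["cheese", "veggie", "pepperoni", "veggie", "hawaiian"]
    print(plurality_winner(ballots))  # -> veggie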

Week 2: Instant Run-off Voting

The second week they use Instant Run-off Voting.  Since the last election narrowed the field to four options, everyone lists those four pizzas in order of preference.

The youngest doesn’t really like veggie pizza, but absolutely hates pineapple.  Ranks cheese 1st, pepperoni 2nd, veggie 3rd, and hawaiian last.

The middle child is a vegetarian.  Both the hawaiian and pepperoni are bad options, but at least the hawaiian has pineapple and onions left over after picking off the ham. Ranks veggie 1st, cheese 2nd, hawaiian 3rd and pepperoni last.

The oldest child moderately likes all of them, but prefers fewer veggies on the pizza.  Ranks pepperoni 1st, cheese 2nd, hawaiian 3rd and veggie last.

Dad, too, moderately likes all of them, but prefers the options with meat and slightly prefers cheese to veggie.  Ranks hawaiian 1st, pepperoni 2nd, cheese 3rd and veggie last.

Mom doesn’t like meat on the pizza as much as Dad, but doesn’t avoid it entirely like the middle child.  Ranks veggie 1st, cheese 2nd, pepperoni 3rd and hawaiian last.

Adding up the first place votes gives the same result as the first election: 2 for veggie, 1 for hawaiian, 1 for pepperoni and 1 for cheese.  However, under IRV the votes for the last place pizza get transferred to the next ranked pizza on the ballot.

There’s something of a problem here, though: a 3-way tie for last place!

A fight nearly breaks out in the Election house.  Neither dad, the oldest, nor the youngest wants their favorite to be eliminated.  The outcome of the election hinges on whose votes get transferred where!

Eventually mom steps in and tries to calm things down.  Since the oldest prefers cheese to hawaiian and the youngest prefers pepperoni to hawaiian, it makes sense that dad’s vote for hawaiian should be the one eliminated.  Since the kids agree with mom’s assessment, dad decides to go along and have his vote transferred to pepperoni.

Now the score is 2 votes for veggie, 2 votes for pepperoni, and 1 vote for cheese.  Since cheese is now in last place, the youngest child’s vote gets transferred to the next choice: pepperoni.  With 3 votes to 2, pepperoni has a majority and is declared the winner.

The middle child is kind of upset by this result because it means she’ll need to pick all the meat off her pizza before eating.  Mom’s not exactly happy with it either, but is more concerned about all the fighting.  They both hope that next week’s election will go better.
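
To make the transfer mechanics concrete, here’s a minimal Python sketch of an IRV count (again just an illustration; ties among the losers are broken alphabetically here, a decision that turns out to matter more than you’d think, as discussed in the open questions below):

    def irv_winner(ballots, tiebreak=min):
        # Repeatedly eliminate the candidate with the fewest first-place
        # votes, transferring those ballots to their next choice, until
        # some candidate holds a majority. `tiebreak` picks which of the
        # tied losers to eliminate (alphabetical by default).
        ballots = [list(b) for b in ballots]
        while True:
            tally = {}
            for b in ballots:
                tally[b[0]] = tally.get(b[0], 0) + 1
            leader = max(tally, key=tally.get)
            if tally[leader] * 2 > len(ballots):
                return leader
            fewest = min(tally.values())
            eliminated = tiebreak(c for c, v in tally.items() if v == fewest)
            for b in ballots:
                b.remove(eliminated)

    ballots = [
        ["cheese", "pepperoni", "veggie", "hawaiian"],  # youngest
        ["veggie", "cheese", "hawaiian", "pepperoni"],  # middle
        ["pepperoni", "cheese", "hawaiian", "veggie"],  # oldest
        ["hawaiian", "pepperoni", "cheese", "veggie"],  # dad
        ["veggie", "cheese", "pepperoni", "hawaiian"],  # mom
    ]
    print(irv_winner(ballots))  # -> pepperoni

With the alphabetical tie-break, cheese gets eliminated first rather than hawaiian, yet the winner is still pepperoni; more on that later.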

Week 3: Score Voting

The third week the Election family goes with Score Voting.  Each family member assigns a score from 0 to 99 for each pizza.  The pizza with the highest score is declared the winner.  Everyone wants to give his/her favorite the highest score and least favorite the lowest, while putting the other options somewhere in between. Here’s how they each vote:

The youngest rates cheese 99, hawaiian 0, veggie 33 and pepperoni 96.

The middle child rates cheese 89, hawaiian 12, veggie 99 and pepperoni 0.

The oldest child rates cheese 65, hawaiian 36, veggie 0 and pepperoni 99.

Dad rates cheese 13, hawaiian 99, veggie 0 and pepperoni 55.

Mom rates cheese 80, hawaiian 0, veggie 99 and pepperoni 40.

Adding all these scores up, the final tally is 346 for cheese, 147 for hawaiian, 231 for veggie and 290 for pepperoni.  Cheese is declared the winner.  Some of them are happier than others, but everyone’s pretty much okay with cheese pizza.
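
A score voting tally is just a sum. Here’s a sketch with the family’s ballots (illustration only):

    def score_winner(ballots):
        # sum each candidate's scores across all ballots; highest total wins
        totals = {}
        for ballot in ballots:
            for candidate, score in ballot.items():
                totals[candidate] = totals.get(candidate, 0) + score
        return max(totals, key=totals.get), totals

    ballots = [
        {"cheese": 99, "hawaiian": 0,  "veggie": 33, "pepperoni": 96},  # youngest
        {"cheese": 89, "hawaiian": 12, "veggie": 99, "pepperoni": 0},   # middle
        {"cheese": 65, "hawaiian": 36, "veggie": 0,  "pepperoni": 99},  # oldest
        {"cheese": 13, "hawaiian": 99, "veggie": 0,  "pepperoni": 55},  # dad
        {"cheese": 80, "hawaiian": 0,  "veggie": 99, "pepperoni": 40},  # mom
    ]
    print(score_winner(ballots))
    # -> ('cheese', {'cheese': 346, 'hawaiian': 147, 'veggie': 231, 'pepperoni': 290})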

Comparing the Results

Three different election methods.  Three different winners.  How do we tell which election method is best?

This is where “Bayesian Regret” comes in.

With each of these 3 elections, we get more and more information about the voters.  The first week, we get their favorites.  The second week, we get an order of preference.  The third week, we get a magnitude of preference.  What if we could bypass the voting altogether and peek inside the voters’ heads to see their true preferences?  For the family above, those preferences would look like this:

              cheese    hawaiian    veggie    pepperoni
    youngest  99.92%     2.08%     34.25%     95.79%
    middle    65.95%    10.09%     73.94%      0.61%
    oldest    74.55%    66.76%     57.30%     83.91%
    dad       52.13%    77.03%     48.25%     64.16%
    mom       87.86%    39.79%     99.72%     63.94%

These values are the relative “happiness levels” of each option for each voter.  It might help to visualize this with a graph.

[Figure: bar graph of each voter’s “happiness level” for each pizza]

If we had this data, we could figure out which option produces the highest overall happiness.  Adding up these “happiness” units, we get 380 for cheese, 195 for hawaiian, 313 for veggie and 308 for pepperoni.  This means the option that produces the most family happiness is the cheese pizza.  The difference between this maximum happiness and the happiness of the option actually elected gives us our “regret” for that election.  In this case: the plurality election has a regret of 380 - 313 = 67, the IRV election has a regret of 380 - 308 = 72, and the score voting election has a regret of 0 (since it chose the best possible outcome).

Now keep in mind that this is only the regret for this particular family’s pizza selection.  To make a broader statement about which election method is the best, we need to look at all possible voter preferences.  This is where our computer simulation comes in.  We randomly assign a number for each voter’s preference for each candidate, run the elections, calculate the regret, then repeat this process over and over to average the results together.  This will give us an approximation of how much regret will be caused by choosing a particular voting system.
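
The core of the simulation is a loop like the following sketch. I’m assuming uniform random utilities here; Smith’s study actually tests several different utility generators, so treat this as the simplest possible variant rather than his method:

    import random
    from collections import Counter

    def honest_plurality(utils):
        # each voter votes for their highest-utility candidate
        votes = Counter(max(range(len(v)), key=v.__getitem__) for v in utils)
        return votes.most_common(1)[0][0]

    def bayesian_regret(method, n_voters=5, n_candidates=4, trials=10000):
        total = 0.0
        for _ in range(trials):
            # each voter's true utility for each candidate
            utils = [[random.random() for _ in range(n_candidates)]
                     for _ in range(n_voters)]
            # society's "magic best" candidate vs. the one actually elected
            sums = [sum(v[c] for v in utils) for c in range(n_candidates)]
            total += max(sums) - sums[method(utils)]
        return total / trials

    print(bayesian_regret(honest_plurality))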

Open Questions

In writing my simulation from scratch, I’ve run into a number of interesting problems.  These aren’t simply programming errors, but rather conceptual differences between my expectations and the implementation.  Some of these questions might be answerable through more research, but some might not have a clear-cut answer.  Reader input on these topics is most welcome.

Implementing IRV is complicated

Not unreasonably hard, but much more so than I had originally anticipated.  It seemed easy enough in theory: keep track of the candidates with the lowest number of votes and eliminate them one round at a time.  The problem I ran into was that in the small elections I was using for debugging, there were frequently ties between low-ranked candidates in the first round (as in the story above).  In the event of a tie, my code would eliminate the candidate with the lower index first.  Since the order of the candidates was essentially random, this isn’t necessarily an unfair method of elimination.  However, it did produce some ugly-looking elections where an otherwise “well qualified” candidate was eliminated early by nothing more than “bad luck”.

This made me question how ties should be handled in IRV.  The sample elections my program produced showed that the order of elimination can have a large impact on the outcome.  In the election described above, my program actually eliminated “cheese” first.  Since the outcome was the same, it didn’t really matter for this example.  However, if the random ordering of candidates had placed “pepperoni” first, then “cheese” would have won the election!  Looking at this probabilistically, the expected regret for this example would be 1/3*0 + 2/3*72 = 48.  A slight improvement, but the idea of non-determinism still feels out of place.

I started looking into some alternative methods of handling ties in IRV.  For a simulation like this, the random tie-breaker probably doesn’t make a large difference; with larger numbers of voters, ties get progressively more unlikely anyway.  However, I do think it could be interesting to compare the Bayesian Regret among a number of IRV variations to see if some tie-breaking mechanisms work better than others.

Bayesian Regret is a societal measure, not an individual one

When I first started putting together my simulation, I did so “blind”.  I had a conceptual idea of what I was trying to measure, but was less concerned about the mathematical details.  As such, my first run produced some bizarre results.  I still saw a difference between the voting methods, but at a much different scale: in larger elections, the differences between voting methods shrank to around a factor of .001.  With a little bit of digging, and double-checking the mathematical formula for Bayesian Regret, I figured out what I did wrong.  My initial algorithm went something like this:

I took the difference between the utility of each voter’s favorite and the candidate elected.  This gave me an “unhappiness” value for each voter.  I averaged the unhappiness of all the voters to find the average unhappiness caused by the election.  I then repeated this over randomized elections and kept a running average of the average unhappiness caused by each voting method.  For the sample election above, voters are about 11% unhappy with cheese versus 24% or 25% unhappy with veggie and pepperoni respectively.
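
Reconstructed as a sketch (an after-the-fact illustration, not the original buggy code):

    def average_unhappiness(utils, winner):
        # each voter's gap between their own favorite and the winner,
        # averaged across all voters
        return sum(max(v) - v[winner] for v in utils) / len(utils)

    # the family's true utilities (cheese, hawaiian, veggie, pepperoni)
    utils = [
        [0.9992, 0.0208, 0.3425, 0.9579],  # youngest
        [0.6595, 0.1009, 0.7394, 0.0061],  # middle
        [0.7455, 0.6676, 0.5730, 0.8391],  # oldest
        [0.5213, 0.7703, 0.4825, 0.6416],  # dad
        [0.8786, 0.3979, 0.9972, 0.6394],  # mom
    ]
    print(average_unhappiness(utils, 0))  # cheese: ~0.11, i.e. about 11%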

I found this “mistake” rather intriguing.  For one thing, it produced a result that made intuitive sense: voters were somewhat “unhappy” no matter which election system was used.  Even more intriguing was that if I rescaled the results of an individual election, they were distributed in close to the same proportions as the results I was trying to replicate.  In fact, if I normalized the results from both methods, i.e. R’ = (R-MIN)/(MAX-MIN), they lined up exactly.

This has become something of a dilemma.  Bayesian Regret measures exactly what it says it does — the difference between the best option for the society and the one chosen by a particular method.  However, it produces a result that is somewhat abstract.  On the other hand, my method produced something a little more tangible — the “average unhappiness of individual voters” — but it makes it difficult to see the differences between methods over a large number of elections.  Averaging these unhappiness values over a large number of elections, the results seemed to converge around 33%.

Part of me wonders if the “normalized” regret value, which aligns between both models, might be a more appropriate measure.  In this view, what matters isn’t the absolute difference between the best candidate and the one elected, but the difference relative to the worst candidate.  However, that measure doesn’t really make sense in a world with the potential for write-in candidates.  I plan to do some more experimenting along these lines, but I think the question of how to measure “regret” is a very interesting one in itself.
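
For what it’s worth, the normalized measure is simple enough to state in code (reusing the `utils` list from the sketch above):

    def normalized_regret(utils, winner):
        # regret rescaled against the worst candidate:
        # 0 = society's best option was elected, 1 = its worst
        sums = [sum(v[c] for v in utils) for c in range(len(utils[0]))]
        return (max(sums) - sums[winner]) / (max(sums) - min(sums))

    # plurality elected veggie (index 2): (380 - 313) / (380 - 195)
    print(normalized_regret(utils, 2))  # -> ~0.36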

“Honest” voting is more strategic than I thought

After correcting the aforementioned “bug”, I ran into another troubling result.  I started getting values that aligned with Smith’s results for IRV and Plurality, but the “Bayesian Regret” of Score Voting was coming up as zero.  Not just close to zero, but exactly zero.  I was going through my code and comparing it to Smith’s methodology when I realized what I had done wrong.

In my first implementation of score voting, the voters were putting their internal utility values directly on the ballot.  This meant that the winner elected would always match up with the “magic best” winner.   Since the Bayesian Regret is the difference between the elected candidate and the “magic best”, it was always zero.   I hadn’t noticed this earlier because my first method for measuring “unhappiness” returned a non-zero value in every case — there was always somebody unhappy no matter who was elected.

Eventually I found the difference.  In Smith’s simulation, even the “honest” voters were using a very simple strategy: giving a max score to the best candidate and a min score to the worst.  The reason the Bayesian Regret for Score Voting is non-zero is the scaling of scores between the best and the worst candidates.  If a voter strongly supports one candidate and opposes another, this scaling doesn’t make much of a difference.  It does, however, make a big difference when a voter is nearly indifferent between the candidates but still gives a large score differential to the candidate that’s slightly better than the rest.
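
Here’s a sketch of that honest-voter scaling (I’m assuming a simple linear rescale between the endpoints, which is the obvious reading of Smith’s description):

    def honest_score_ballot(utilities, top=99):
        # rescale so the favorite gets `top` and the least favorite gets 0
        lo, hi = min(utilities), max(utilities)
        if hi == lo:              # a totally indifferent voter
            return [top // 2] * len(utilities)
        return [round(top * (u - lo) / (hi - lo)) for u in utilities]

    # A nearly indifferent voter still casts an extreme ballot,
    # which is exactly what pushes the regret above zero:
    print(honest_score_ballot([0.50, 0.51, 0.52]))  # -> [0, 50, 99]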

With this observation, it became absolutely clear why Score Voting minimizes Bayesian Regret: the more honest the voters are, the closer the Bayesian Regret gets to zero.  This raises another question: how much dishonesty can the system tolerate?

Measuring strategic vulnerability

One of my reasons for trying to reproduce this result was to experiment with additional voting strategies outside the scope of the original study.  Wikipedia cites another study, by M. Balinski and R. Laraki, suggesting that Score Voting is more susceptible to tactical voting than the alternatives.  However, those authors may well be biased towards their own proposed method, so I think it’s worthwhile to try to replicate that result as well.  The issue is that I’m not sure what an appropriate way to measure “strategic vulnerability” would even be.

Measuring the Bayesian Regret of strategic voters and comparing it with that of honest voters could be a starting point.  The problem is how to normalize the difference.  In Smith’s own results, the Bayesian Regret of Score Voting increases by 639% when voters use more complicated strategies, while Plurality only increases by 188%.  The problem with comparing them this way is that the Bayesian Regret of strategic Score Voting is still lower than that of honest Plurality.  Looking only at the relative increase in Bayesian Regret isn’t a fair comparison.

Is there a better way of measuring “strategic vulnerability”?  Bayesian Regret only measures the distance from the “best case scenario”, and the very nature of strategic voting is that it shifts the result away from the optimal solution.  I think that to measure the effects of voting strategy, there needs to be some way of taking the “worst case scenario” into consideration as well.  The normalized regret I discussed above might be a step in the right direction.  Any input on this would be appreciated.

Disclaimer

Please don’t take anything said here as gospel.  This is a blog post, not a peer-reviewed journal.  This is my own personal learning endeavor and I could easily be wrong about many things.  I fully accept that and will hopefully learn from those mistakes.   If in doubt, experiment independently!

Update: The source code used in this article is available here.

What I’ve discovered, learned or shared by using #mathchat

This was a #mathchat topic in July of 2012 that I really wanted to write about but didn’t quite get around to at the time.  This happened partly because I was busy juggling work and graduate school, but also because I felt a bit overwhelmed by the topic.  I’ve learned so many things through my involvement in #mathchat that the idea of collecting them all was daunting.  It also kind of bothered me that my first attempt at a response to this prompt turned into a lengthy list of tips, books, and links.  That type of content makes sense on Twitter; it’s actually the perfect medium for it.  To turn this into a blog post, however, I needed some coherency.  I felt like there was a pattern to all of these things that #mathchat had taught me, but I just couldn’t quite put my finger on it.

A year and a half has passed since this topic came up.  It’s now been 6 months since the last official #mathchat.  Despite this, Tweeps from all over the world continue using the hashtag to share their lesson ideas and thoughts about math education.  It’s inspiring.  The weekly chats might have stopped, but the community continues to flourish.  Looking back on how things have changed on #mathchat helped put perspective on how #mathchat changed me.  I think I’m finally ready to answer this prompt.

What I learned by using #mathchat was that learning requires taking risks.

On the surface, it seems like this assertion might be obvious.  Whenever we attempt something new, we run the risk of making a mistake.  By making mistakes we have an opportunity to learn from them.  The issue is that we go through this routine so many times that it becomes habitual.   When learning becomes automatic, it’s easy to lose sight of the risks and how central they are to the learning process.

Consider the act of reading a book.  For many, like myself, this is the routine method of learning new information.  In fact, it’s so routine that the risks aren’t readily apparent.  That doesn’t mean they aren’t there.  Have you ever read a book and found yourself struggling to understand the vocabulary?  For me, Roger Penrose’s Road to Reality is still sitting on my bookshelf, taunting me, because I can’t go more than a couple of pages without having to look things up elsewhere.  Attempting to read a book like this entails a risk of making myself feel inadequate.  It’s much easier to read a book that’s within one’s existing realm of knowledge.  By taking the risk out of reading, it becomes a recreational activity.  This isn’t necessarily a bad thing — we could all use some relaxation time now and then — but it’s not until we step out of that comfort zone that the real learning begins.  Have you ever read a book that made you question your own assumptions about the world?  It’s not often that this happens, because we’re naturally drawn to books that reaffirm our own beliefs.  When it does happen, the impact can be quite profound.  The further a book is from your existing world model, the greater the risk of that model being challenged by reading it, but the potential for learning scales in proportion.

I was rather fortunate to have discovered #mathchat when I did.  I had signed up for Twitter at approximately the same time I started teaching math.  Anyone who’s ever been a teacher knows that learning a subject and teaching that subject are two entirely different beasts.  I’d been doing math for so long that most of it was automatic.  It wasn’t until I started teaching that I realized I had forgotten what it was like to learn math.  As a result, I was struggling to see things from the perspective of my students.  I needed to step out of my own comfort zone and remember what it was like to learn something new.  It was through complete coincidence that my wife stumbled upon Twitter at this time and said, “Hey, I found this new website that you might find interesting”.

I didn’t join Twitter looking for professional development.  In fact, for a while at the start I didn’t even know what “PD” stood for.  I joined Twitter purely out of curiosity.  I was never really comfortable interacting socially with new people, and it seemed like an opportunity to work on that skill.  I called it “my experiment”.  I didn’t even use my full name on Twitter for the longest time because I was afraid of “my experiment” going wrong.  I started simply: looking for topics I was interested in, following people who sounded interesting, and speaking up when I felt I had something to say.  One of my saved searches was “#math”, and I started trying to answer questions that people were asking on Twitter.  This led to making some of my first friends on Twitter.  I noticed that some of the people who regularly tweeted on #math also frequently used the hashtag #edchat, often posting multiple #edchat tweets within a short period of time.  I had inadvertently stumbled upon my first real-time Twitter chat.  Once I started participating in #edchat, my network grew rapidly.  From there, it was only a matter of time before I discovered #mathchat.

My social anxiety was still quite strong at this time.  With each tweet, I was afraid that I would say something stupid and wake up the next day to find that all my followers had vanished.  However, #mathchat provided a welcoming atmosphere and discussion topics that were relevant to my work environment.  This gave me an opportunity to engage in discussion while mitigating some of the risks.  I knew that each topic would be close to my area of expertise, and the community was composed of people who were also there to learn.  There was a certain comfort in seeing how people interacted on #mathchat.  People would respond critically to the content of tweets, but always treated each participant with dignity and respect.  I was experiencing firsthand what a real learning community could be like.

A frequent motif in these #mathchat discussions was Lev Vygotsky’s model of learning.  With my background in psychology, I was already familiar with the concepts and vocabulary.  However, #mathchat helped me link this theory with practice.  I became more and more comfortable with a social perspective on learning because I was learning through my social interactions.  While I had known the definition of terms like “zone of proximal development”, I wasn’t quite to the point where I could see the line separating what I could learn on my own from what I could learn with assistance.  I had always been a self-driven learner, but to be successful in learning on my own I needed to limit myself to areas close to my existing skills and knowledge.  I needed to minimize the risks when learning alone.  Learning in a social environment was different.  I needed to become comfortable taking larger risks, with the reassurance that the people I was learning with would help me pick myself up when I fell.

The #mathchat discussions themselves were not without risks of their own.  Colin took a risk himself by creating #mathchat.  It was entirely possible that he could have set this chat up only to have no one show up to participate.  Indeed, many a #mathchat started with an awkward period of silence where people seemed hesitant to make the first move.  There’s much lower risk in joining a discussion in progress than starting one from scratch. The risk is lower still by simply “lurking” and only reading what others have said.  As time went on, there was a growing risk that #mathchat would run out of topics for discussion.  This risk has since manifested itself and #mathchat has entered a state of hiatus.

I’m aware of these risks only in hindsight.  At the time, I wasn’t really conscious of the shift occurring in my own model of learning.  What started to make me realize this change was the adoption of my two cats.  This provided me another opportunity to put learning theory into practice by training them (although it’s arguable that they’re the ones training me instead).  The smaller one, an orange tabby named Edward, responded quickly to classical and operant conditioning with cat treats.  The larger one, a brown tabby named Alphonse, didn’t seem to care about treats.  It quickly became obvious that I was using the wrong reinforcer for him.  With his larger body mass and regular feeding schedule, there was no motivation for him to consume any additional food.  It’s easy to forget that in the experiments these concepts were developed from, the animals involved were bordering on starvation.  The risk of not eating is a powerful motivator for those animals to learn in the experimental setting.  My cat Alphonse was under no such risk.  He was going to be fed whether he played along with my games or not.  I’ve since learned that Alphonse responds much better to training when there’s catnip involved.

The key to successful training depends very much on identifying a suitable reinforcer.  What functions as a reinforcer varies widely from subject to subject.  In animal studies, survival makes for a universal reinforcer, as the reward of living to procreate is (almost) always worth the risk.  However, humans follow a slightly different set of rules because our survival is seldom in question.  We’re also unique in the animal kingdom because we can communicate and learn from others’ experiences.  In a typical classroom situation, the ratio between risk and reward takes on greater significance.  We’re faced with such an overabundance of information about the world that we can’t possibly learn it all.  Instead of maximizing performance on a test (the desired outcome), a common alternative is for students to minimize the risk of disappointment.  It’s often much easier for a student to declare “I’m bad at math” than to go through the effort of actually trying to learn a new skill.  Rather than taking the high-risk choice of studying for the test with only a moderate payoff (a grade), these students opt for a low-risk, low-payoff option by simply choosing not to care about the exam.  When looked at from a risk/reward perspective, maybe these students are better at math than they’re willing to admit.

The solution, as I discovered through #mathchat, is to lower the risks and adjust the rewards.  I’ve been working on making my courses more forgiving of mistakes and acknowledging them as an integral part of the learning process.  I’ve also been working on increasing the amount of social interaction I have with students and trying to be a better coach during the learning process.  There’s no denying that I still have much to learn as a teacher, but thanks to #mathchat I have a clearer idea of how to move forward.  For me to progress as a teacher, I need to be more comfortable taking risks.  It’s far too easy to fall into the habit of teaching the same class the same way, over and over.  I need to do a better job of adapting to different audiences and trying new things in my classes.  Fortunately, there’s a never-ending stream of new ideas on Twitter that I’m exposed to on a regular basis thanks to my “Personal Learning Network”.

I feel it’s a crucial time for me to be sharing this perspective on the role of risk in learning.  There seems to be a rapidly growing gap between teachers and politicians on the direction of educational policies.  There’s a political culture in the US that is obsessed with assessment. Policies like Race-to-the-Top and No Child Left Behind emphasize standardized testing and value-added measures over the quality of interpersonal relations.  The problem with these assessment methods is that they don’t take the inherent risks of learning into consideration.  Risk is notoriously difficult to measure and it doesn’t fit nicely into the kinds of equations being used to distribute funding to schools.

There was recently a backlash of (Badass) teachers on Twitter using the hashtag #EvaluateThat to post stories of how our assessment methods fail to capture the impact teachers make in the lives of their students.  Teachers are the ones who witness the risks faced by students up close.  It’s our job as teachers to identify those risks and take steps to manage them so that students can learn in a safe environment.  As the stories on #EvaluateThat show, many teachers go above and beyond expectations to help at-risk students.

While teachers struggle to reduce risks, policy makers continue to increase them through more high-stakes exams.  At times it almost seems like politicians are deliberately trying to undermine teachers.  Maybe what we need in education policy is a shift in vocabulary.  Let’s stop worrying so much about “increasing performance outcomes” and instead focus on “decreasing risk factors”.  Doing so would encourage a more comprehensive approach to empowering students.  For example, there’s strong statistical evidence that poverty severely hinders student success.  By addressing the risks outside the classroom, we can enable students to take more risks inside the classroom.