Fermat's Last Theorem

note: this blog post is about a pretty much random piece of number theory that had gone unsolved for over 300 years, but I will admit that it’s not that exciting for regular people. Especially this one, where I’m pretty much just stealing a Wikipedia article because I found it interesting. It’s my blog, I do what I want.

Fermat’s Last Theorem states that no three positive integers $a$, $b$, and $c$ satisfy the equation $a^{n} + b^{n} = c^{n}$ for any integer value of $n$ greater than $2$.

The cases $n = 1$ and $n = 2$ have long been known to have infinitely many solutions, and Fermat specifically pointed that out when he gave us what little information he did.
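Just to make the claim concrete, here’s a tiny brute-force search (my own toy snippet, and obviously not a proof, since it only checks a finite range):

```python
# Brute-force search for positive integer solutions to a^n + b^n = c^n.
# For n = 2 this finds plenty (the Pythagorean triples); for n >= 3 it
# finds nothing, which is exactly what Fermat claimed holds in general.
LIMIT = 50

for n in range(2, 6):
    solutions = [
        (a, b, c)
        for a in range(1, LIMIT)
        for b in range(a, LIMIT)
        for c in range(b, 2 * LIMIT)
        if a**n + b**n == c**n
    ]
    print(f"n={n}: {len(solutions)} solutions, e.g. {solutions[:3]}")
```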

The Story

Pierre de Fermat was a prominent French lawyer and mathematician. He kept a book of the mathematical ideas of his day and would scribble proofs in the margins of its pages.

Fermat’s last theorem has a very interesting story. He was sitting at home alone one day by the fire, sipping wine and eating cheese, reading everyone’s favorite math classic, Diophantus’ Arithmetica. (His annotated copy is what became the famous 1670 edition, published after his death.) This book contained a lot of the mathematical conjectures and problems known at the time, and he was proving the conjectures within the margins of the pages!

One of the last theorems noted in the book was the one above. He had written that it could be proven, but that there wasn’t enough space in the margins of the pages at the time. (a terrible shame) So he never actually wrote down the proof, and unfortunately passed away before writing it. Pierre de Fermat died aged 57 or 58, on January 12, 1665 in Castres, France. The cause of his death is not known. Three days before his death, he had been carrying out legal business in the local courthouse. He was buried in the Church of St. Dominique in Castres.

The problem

Now this left us with a pretty difficult problem: all of the other claims scribbled in the book were eventually proven correct, so we had reason to believe that this theorem was also true, but we had no way to prove it!

This set the math community ablaze.

With the special case $n = 4$ proved by Fermat himself, it suffices to prove the theorem for exponents $n$ that are prime numbers (this reduction is considered trivial to prove, mostly by angry math professors). Over the next two centuries (1637–1839), the conjecture was proved for only the primes $3$, $5$, and $7$, although Sophie Germain developed and proved an approach that applied to an entire class of primes.

Ernst Kummer extended this and proved the theorem for all regular primes, leaving irregular primes to be analyzed individually. Building on Kummer’s work and using sophisticated computer studies (probably a big for loop tbh), other mathematicians were able to extend the proof to cover all prime exponents up to four million, but a proof for all exponents was inaccessible (meaning that it was either impossible, exceedingly difficult, or unachievable with current knowledge).

The mystery intensifies

Entirely separately, around 1955, Japanese mathematicians Goro Shimura and Yutaka Taniyama suspected a link might exist between elliptic curves and modular forms, two completely different areas of mathematics. Known at the time as the Taniyama–Shimura–Weil conjecture, and (eventually) as the modularity theorem, it stood on its own, with no apparent connection to Fermat’s Last Theorem. It was widely seen as significant and important in its own right, but was (like Fermat’s theorem) widely considered completely inaccessible to proof.

The last guy that would be tortured by this French troll.

Enter Andrew Wiles, who grew up with a childhood fascination with Fermat’s theorem (not a fun childhood, in the opinion of this author). Having a background in elliptic curves and related fields, he decided to try to prove the Taniyama–Shimura conjecture as a way to prove Fermat’s Last Theorem. In 1993, after six years working in secrecy on the problem, Wiles succeeded in proving enough of the conjecture to prove Fermat’s Last Theorem. Wiles’s paper was massive in size and scope. A flaw was discovered in one part of his original paper during peer review, and it required a further year and a collaboration with a past student, Richard Taylor, to resolve. For his proof, Wiles received the 2016 Abel Prize.

Imagine dedicating your life to solving a problem, spending six years in secrecy, only to have an error pointed out when publishing the solution. The moment when he discovered the missing piece and completed the proof after an additional year must have been incredible.

"I was sitting at my desk examining the Kolyvagin–Flach method. It wasn’t that I believed I could make it work, but I thought that at least I could explain why it didn’t work. Suddenly I had this incredible revelation. I realised that, the Kolyvagin–Flach method wasn’t working, but it was all I needed to make my original Iwasawa theory work from three years earlier. So out of the ashes of Kolyvagin–Flach seemed to rise the true answer to the problem. It was so indescribably beautiful; it was so simple and so elegant. I couldn’t understand how I’d missed it and I just stared at it in disbelief for twenty minutes. Then during the day I walked around the department, and I’d keep coming back to my desk looking to see if it was still there. It was still there. I couldn’t contain myself, I was so excited. It was the most important moment of my working life. Nothing I **ever do again** will mean as much."

— Andrew Wiles, as quoted by Simon Singh

This is the small but fun story of Fermat’s last theorem, formally proven and published in 1995, 358 years after it was first conjectured, thanks to the dedication of Andrew Wiles.

note: Like I said, I didn’t have much to go on when it came to the second blog post I promised, but here it is. I hope the story behind this fun thing was relatively interesting. I didn’t have the time or the background to go into the proof here, but you can find the linked paper below. (it’s over 100 pages so have fun.)

Sources

The Modern Theory of Knowledge

What does it mean to know something?

We say all the time that we know things. I know that my friend owns a car, and I know that $2 + 2 = 4$.

I should just say now that if you think the question of knowledge is uninteresting, this blog post will not get any better for you.

Are those two pieces of information both knowledge?

They are both statements about something that we as people might say that we know, but is there a difference between those two types of knowledge?

Normally during these blog posts, we try to take the time to define a bunch of concepts in order to know exactly what we’re dealing with, and then make some interesting conclusions based on the consequences of those definitions. The only problem here is that humanity currently doesn’t have a good definition for the word “knowledge”.

What does it mean to know something?

When we say that we know something, most people have a shared and unspoken understanding of what that really means. This blog post is going to attempt to determine what that sentence really means.

Much earlier, people used to regard knowing something as having a justified true belief in that thing. Let’s roll with that for a bit. As we go through this, I’ll try to justify how this came to be our definition.

To be more formal, $S$ knows that $p$ if and only if:

  • $p$ is true
  • $S$ believes that $p$
  • $S$ is justified in believing that $p$

The truth condition

This seems kind of obvious in a way, but most people seem to agree that you can’t know something that’s false. You can’t say “I know $2 + 2 = 5$” because that’s just not true.

Something’s truth does not require that anyone can know or prove that it is true. Not all truths are “established” truths. If you flip a coin and never check how it landed, it may be true that it landed heads, even if nobody has any way to tell. Truth is a matter of how things are, not how they can be shown to be.

So when we say that only true things can be known, we’re not saying anything about how anyone can access the truth.

The belief condition

The idea here is that you can only know what you believe.

Although initially it might seem obvious that knowing that p requires believing that p, a few philosophers have argued that knowledge without belief is indeed possible.

Take this example suggested by Colin Radford (1966). Suppose Albert is quizzed on English history. One of the questions is: “When did Queen Elizabeth die?”

Albert doesn’t think he knows, but answers the question correctly. Moreover, he gives correct answers to many other questions to which he didn’t think he knew the answer. Let us focus on Albert’s answer to the question about Elizabeth:

  • $E$ : “Elizabeth died in 1603.”

Well, this is weird. Albert here is making an assertion about truth without in fact believing that it’s correct, even though it turned out that he was right! Surely Albert doesn’t really know that’s when Queen Elizabeth died.

Radford makes the following two claims about this example:
  • Albert does not believe $E$

  • Albert knows $E$

Albert’s correct answer is not an expression of knowledge, perhaps because, given his subjective position, he does not have justification for believing $E$; he remembered an answer that happened to be correct. The justification condition is a key component of someone knowing something.

note: you could also argue that Albert perhaps does believe $E$, but that feels like a weaker objection to me personally.

The justified belief condition

Why must a belief be justified? While we should always be able to justify our beliefs, Albert doesn’t have a justification for believing the answer he gives in our previous example. He simply gives an answer he remembers, which happens to be correct in the end.

In addition, something could be a justified belief at one time and not justified the next. My favorite example of this is Copernicus. He wrote a very famous (though not the first) paper about the earth’s place in the universe. Before that discovery was made, you may have been justified in believing that the earth was the center of the universe, but the NEXT DAY you wouldn’t have been.

This is problematic though, because right now we have lots of theories about light, matter, math, chemistry, and science, but we can never truly be sure that the things we currently believe are things that we know. We could potentially discover new things in the future and find that, by definition, we didn’t know the things we thought we knew before.

Someone could believe they knew that the earth was flat their whole lives and have justification for that belief (check out the flat earth society), and if it was scientifically justifiable, then they should be able to say that they knew the earth was flat.

Otherwise we could go our whole lives without ever really being able to “know” anything.

This is how we’ve ended up at the definition of “Justified True Belief”.

Here’s why that doesn’t work.

Everyone was generally pretty happy with making statements about knowledge being justified true belief until we found that it didn’t hold up against certain disjunctive statements.

Enter Edmund Gettier, a wonderful philosopher who published one of the shortest papers ever (literally two pages). He gives two great examples disproving the idea of knowledge we’ve been building up so far.

We’re going to talk more about the second one because I think it’s more useful and a more powerful example.

totally unnecessary backstory

Imagine two people, Smith and Jones, they’re both subpar accountants at an accounting firm in Boston. They’ve worked together at the same firm for a few years. Smith thinks that Jones has ridiculous views about modern architecture but other than that they respect each other.

Let us suppose that Smith has strong evidence for the following:

  • $f$: Jones owns a Ford.

Nothing crazy there, you’d definitely be justified in believing that someone owned a car if you saw them using it to drive to work every day for a few years.

To be more formal:

Smith’s evidence might be that Jones has at all times in the past within Smith’s memory owned a car, and always a Ford, and that Jones has just offered Smith a ride while driving a Ford.

Now imagine that Smith has another friend, Brown, (wow, look at Mr. Popular over here. TWO FRIENDS.) who likes to travel. Smith doesn’t know anything about where Brown might be on any given day.

Let’s say Smith selects three places at random, and constructs the following three propositions:

  • $g$: Either Jones owns a Ford, or Brown is in Boston;
  • $h$: Either Jones owns a Ford, or Brown is in Barcelona;
  • $i$: Either Jones owns a Ford, or Brown is in Brest-Litovsk.

First thing you might notice is that each of these propositions is entailed by $f$ which we talked about before. Smith is therefore completely justified in believing each of these three propositions. Smith, of course, has no idea where Brown is.
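(The logical rule at work here is disjunction introduction: from any statement, you may infer the disjunction of that statement with anything else. In symbols: $f \vdash f \lor q$ for any proposition $q$, no matter how unrelated $q$ is.)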

Now let’s say we find out two new pieces of information.

  • Jones does not own a Ford, but is at present driving a rented car.

  • Brown happens to be in Barcelona, making $h$ true.

This is an instance in which proposition $h$ is a justified true belief, yet clearly isn’t knowledge. This is because the part of the statement that makes it a justified belief is separated from the part of the statement that actually makes it true.

… shit.

This paper was a huge deal, because for a while people thought this problem was pretty much dealt with. But now we’ve got to deal with this kind of problem: these Gettier cases, in which an assertion about the world can be justified by an ultimately false belief and simply happens to be true due to things unknown to the person making the assertion.

Now it’s worth saying that philosophy hasn’t come to an actual consensus on this issue! There are some crazy examples about dogs in a field and organized criminals setting up barn facades, but we’re going to talk a little bit about one particular way to deal with the problem, which is to add a new condition to the Justified True Belief conditions.

  • Safety : In all nearby worlds where $S$ believes that $p$, $p$ is not false.

The notion of safety here is one that describes how similar the state of affairs could be while still having the same result.

In a “nearby” world, Brown might have gone to Costa Rica instead (classic Brown). Adding the safety condition to our JTB definition of knowledge enables us not to count this statement as knowledge, because in nearby possible worlds Brown could have gone anywhere, and $h$ is not going to be true in those nearby possible worlds.

if you’re still reading … thank you.

So this seems to be a fairly concrete definition of knowledge that does work for a lot of cases.

There is a refutation worth exploring that comes from Juan Comesaña, who published it in 2005.

It’s a little unfair to say the matter is resolved given the vagueness of the “nearby” condition. In Comesaña’s example, the host of a Halloween party enlists Judy to direct guests to the party.

Judy’s instructions are to give everyone directions, but if she sees Michael, the party will be moved to another location. (The host does not want Michael to find the party.) Suppose she never sees Michael, and one guest happens to wear a costume quite different from the one Michael was going to wear. Then that guest’s belief about the whereabouts of the party, justified and based on Judy’s testimony, will be true.

Comesaña points out that it could easily have been false. (Had the guest merely made a slightly different choice about his costume, he would have been mistaken for Michael and deceived.) Comesaña describes the case as a counterexample to the safety condition on knowledge.

The idea here is that the guest’s justified true belief (the directions to the party) seems like knowledge, and yet it is not safe: in a nearby world in which he wore a similar costume, $p$ could in fact be false.

However, it is open to a safety theorist to argue that the relevant skeptical scenario, though possible and in some sense nearby, is not near enough in the relevant respect to falsify the safety condition. Such a theorist would, if she wanted the safety condition to deliver clear verdicts, face the task of articulating just what the relevant notion of similarity amounts to.

We’re now pretty much at the modern day. We haven’t achieved perfect consensus, but it’s certainly really interesting to contemplate what it means to know something!

Sources

P.S.

On a totally unrelated note, there is a fascinating paper even shorter than Gettier’s that I came across while doing research for this post, published by a clinical psychologist in the Fall of 1974, titled: The unsuccessful self-treatment of a case of “Writer’s Block”

Is the soul a physical part of you?

A metaphysical argument about the soul’s relation to the body.

Today we’re going to get into a small discussion about whether the soul is a part of you. Meaning: is your soul something that’s necessarily a physical part of you as a person?

To be more specific, this is an argument that was posed to support the existence of “a soul” as an explanation for the phenomenon of human free will.

Note: If you have strong religious views on what the soul is and don’t enjoy questioning them, you may not enjoy reading this post. That being said, this post is not written to convince you of a particular point of view, but simply inform you about an interesting argument about the philosophical debate on the nature of the immaterial soul.

Laying the Groundwork

So before we can have this conversation, it may be useful to set a context for it. First, we have to talk about a couple of views about what people are before we can posit an argument in favor of one case or the other.

There are two prevailing views on this issue: the dualist view, that a person is comprised of a body and a soul, and the physicalist view, that a person is just a body.

Both of these views have interesting consequences, and the argument we’re going to discuss comes from the dualist perspective.

We need to define an important concept: the idea that a system can be deterministic, meaning that if you reproduce an identical system multiple times, the same predictable outcome will be observed, following specific rules. Most of physics uses determinism to predict how a system will unfold. To quote Google:

de·ter·min·ism (noun)

/dəˈtərməˌnizəm/

The doctrine that all events, including human action, are ultimately determined by causes external to the will. Some philosophers have taken determinism to imply that individual human beings have no free will and cannot be held morally responsible for their actions.

Let’s define another concept really quickly.

Free Will: the idea that if I am put into the exact same situation multiple times, I could make a different decision each time. There is nothing necessarily consistent about the actions of a being with free will.

So without further ado, the argument is the following.

There are three premises:

  • Humans have free will.

  • Nothing subject to determinism has free will.

  • All purely physical systems are subject to determinism.

Conclusion: Therefore, humans are not purely physical systems.

So to explain the concept of free will we appeal to the idea of a soul, something non-physical.

How should we evaluate this argument?

I think the first question that I asked when I heard this was “What is it about free will that proves that we’re not purely physical objects?”

Why would there be an incompatibility between the idea of having free will and the idea of things being determined?

But we’ll start our analysis with a couple of important questions.

Is this argument valid?

The first question we should ask is, “does our argument make sense?” Does this argument’s conclusion truly logically follow from its premises?

So I’ll save you some time and tell you that its conclusion does follow. Given these particular assumptions, it makes sense.
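To see why, here’s one way to formalize it (a sketch in first-order logic; the predicate names are just my own shorthand, not from the original argument):

$$
\begin{aligned}
&\text{P1: } \forall x\,(\mathrm{Human}(x) \rightarrow \mathrm{FreeWill}(x)) \\
&\text{P2: } \forall x\,(\mathrm{Deterministic}(x) \rightarrow \neg\mathrm{FreeWill}(x)) \\
&\text{P3: } \forall x\,(\mathrm{PurelyPhysical}(x) \rightarrow \mathrm{Deterministic}(x)) \\
&\text{C: } \forall x\,(\mathrm{Human}(x) \rightarrow \neg\mathrm{PurelyPhysical}(x))
\end{aligned}
$$

By the contrapositive of P2, anything with free will is not deterministic; by the contrapositive of P3, anything not deterministic is not purely physical. Chain those onto P1 and the conclusion falls out, so any fight has to be over the premises themselves.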

Are our assumptions valid?

Whether you realize it or not, this argument rests on some serious assumptions that we should be more rigorous about defining before we accept this radical conclusion.

premise one: This argument rests on the idea of free will: the idea that if we have free will, we can’t just be purely physical systems. I think the strongest objection someone could give would be that we don’t truly have free will. We might certainly believe we do, but free will isn’t something that can be observed. Perhaps the sense that we could have acted otherwise is an illusion.

premise three: “All purely physical systems are subject to determinism.” This is a pretty serious claim about empirical science.

Quantum mechanics, as an example, is certainly not deterministic. We can observe particles that might be found in one position 60% of the time, without knowing *why* they behave the way that they do. Determinism is not true at the level of quantum mechanics. This leads me to say that premise three might be false.

I should say that premise two can be challenged as well and I’d encourage you to think about why.

So I’ve told you the argument, I’ve told you what I think, let me know what you think~

Souls are … weird.

If you found this interesting, you may enjoy viewing the lecture series from Yale; Death (PHIL 176) : link here.

3D Printed Weapons: Explained

The History of 3D Printed Weapons in America

The Liberator, a 3D printed handgun

What you’re looking at above is an image of one of the very first 3D printed weapons ever designed and distributed for free on the internet. The idea that you could go on the internet, download some files, and produce a gun in your own home by printing it would have been unheard of even in 2012. So what’s really changed in the past couple of years?

There is a lot to understand here; so let’s start with the basics.

What is 3D printing?

if you already know about 3D printing, feel free to skip ahead.

3D printing is the ability to use special printers that layer plastics in order to create 3-dimensional objects. This has become a popular phenomenon in the tech community, enabling a new and exciting avenue for creativity.

Warning: history incoming:

The earliest 3D printing technologies first became visible in the late 1980s, at which time they were called Rapid Prototyping (RP) technologies. This is because the processes were originally conceived as a faster and more cost-effective method for creating prototypes for product development within industry. As an interesting aside, the very first patent application for RP technology was filed by a Dr Kodama in Japan, in May 1980. Later, 3D Systems’ first commercial RP system, the SLA-1, was introduced in 1987, and following rigorous testing the first of these systems was sold in 1988. 2007 was the year the shoots started to show through, and this embryonic, open source 3D printing movement started to gain visibility.

But it wasn’t until January 2009 that the first commercially available and affordable 3D printer (in kit form and based on the RepRap concept) was offered for sale. This was the BfB RapMan 3D printer, closely followed by MakerBot Industries in April of the same year; MakerBot’s founders were heavily involved in the development of RepRap until they departed from the open source philosophy.

How it works…

The most common type of 3D printing is a process called Fused Deposition Modelling (FDM), and this technology is the primary method used to make not just the handful of working 3D printed guns that have been made, but almost all 3D printed objects.

Yoda being 3D printed in layers

The process works by slicing a digital file containing the 3D model into dozens, sometimes hundreds of (usually) 0.1 mm thick layers. The 3D printer will then use robotic precision to replicate each of these individual slices on a printing bed using melted thermoplastics, one layer printed on top of another. The result, hopefully, will be a complete physical copy of the digital file. Depending on the size of the part, the type of material being used and the model of 3D printer, this process can take several hours.
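As a toy illustration of that slicing step (a minimal sketch with made-up numbers, not any real slicer’s algorithm):

```python
# Toy version of FDM slicing: divide a model's height into fixed-thickness
# layers, the way a slicer plans the stack of cross-sections to print.
LAYER_HEIGHT_MM = 0.1  # a common FDM layer thickness

def slice_heights(model_height_mm, layer_height_mm=LAYER_HEIGHT_MM):
    """Return the z-height at which each layer gets deposited."""
    n_layers = round(model_height_mm / layer_height_mm)
    return [round((i + 1) * layer_height_mm, 4) for i in range(n_layers)]

layers = slice_heights(20.0)  # a 20 mm tall part
print(f"{len(layers)} layers: z = {layers[0]} mm up to z = {layers[-1]} mm")
# -> 200 layers: z = 0.1 mm up to z = 20.0 mm
```

Two hundred layers for a small part is why even a simple print takes hours: the print head has to physically trace every one of those cross-sections.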

How did 3D weapons come to exist?

Our story begins where most violence happens: Little Rock, Arkansas. In the summer of 2012, Ben Denio and Cody Wilson sought to create a political and legal vehicle for demonstrating and promoting the subversive potential of publicly available 3D printing technologies. They called this venture “Defense Distributed”. On July 27th, 2012, they launched the Wiki Weapon Project, the effort to create and release the files for the world’s first printable handgun.

A month later, in August, Indiegogo.com removed DD’s inaugural fundraising campaign from its website, citing a terms of service violation. This removal prompted a backlash in the tech community, and the Bitcoin community helped fund the Wiki Weapon. By September 2012, DD had raised enough money to start prototyping and experimentation, when industry player Stratasys (a popular manufacturer of 3D printers) scandalously revoked its lease with the company and quickly repossessed its printer. This repossession, one of the first of its kind, made world news, and the Wiki Weapon found even more support. Research and development continued…

The Liberator parts, printed and disassembled.

This is the Liberator.

These components are the brainchild of Defense Distributed’s vision of open sourced weapons available to the public.

The Liberator was published as an open source software package, containing CAD (computer-aided design) and .stl files that would allow you to print the individually designed components from your own home and reproduce the pistol.

This was interesting. It seems like this should be protected by both the First Amendment, since it involves the expression of written information, and the Second (I’m thinking about the 2008 Heller decision).

Except that it wasn’t. #Justice

A letter was sent from the federal government addressed to Wilson and Defense Distributed. It was signed by Glenn Smith, chief of enforcement for the State Department’s Bureau of Political-Military Affairs, Office of Defense Trade Controls Compliance.

Smith warned Wilson that the technical specs he made publicly available may be “ITAR-controlled technical data” released “without the required prior authorization” from the State Department. ITAR stands for International Traffic in Arms Regulations, which are the U.S. government’s set of rules controlling the import and export of munitions.

In other words, by releasing CAD files allowing anyone with access to a 3D printer to make a somewhat fragile plastic pistol, Wilson may have become an illegal arms trafficker. The State Department didn’t say for sure that this information (some might call it speech) fell under its jurisdiction. But while regulators pondered the question (and four months later, at press time, they were still pondering), they demanded that Wilson “treat the above technical data as ITAR-controlled,” meaning that “all such data should be removed from public access immediately.”

So what we’re essentially seeing is that sharing code that can create guns is considered the same thing as exporting guns? That’s weird.

After waiting on the issue for a bit, Defense Distributed decided they were going to take matters into their own hands and sue the federal government for the right to distribute files online, pursuant to their constitutional right to free speech.

In a very interesting (but completely unsurprising) ruling, the courts ruled against Defense Distributed.

You can read the original court documents relating to the appeal here.

“Ordinarily, of course, the protection of constitutional rights would be the highest public interest at issue in a case. That is not necessarily true here, however, because the State Department has asserted a very strong public interest in national defense and national security. Indeed, the State Department’s stated interest in preventing foreign nationals — including all manner of enemies of this country — from obtaining technical data on how to produce weapons and weapon parts is not merely tangentially related to national defense and national security; it lies squarely within that interest.” — Defense Distributed v. Department of State¹

However, it should be noted that there was not consensus, as it was a 2–1 decision. Here is a statement from Fifth Circuit Judge Edith Jones.

“In sum, it is not at all clear that the State Department has any concern for the First Amendment rights of the American public and press. Indeed, the State Department turns freedom of speech on its head by asserting, “The possibility that an Internet site could also be used to distribute the technical data domestically does not alter the analysis….” The Government bears the burden to show that its regulation is narrowly tailored to suit a compelling interest. It is not the public’s burden to prove their right to discuss lawful, non-classified, non-restricted technical data.” — Defense Distributed v. Department of State²

Where are we now?

If you’re interested, the Liberator’s original source files can still be found on torrent sites.

Also, Defense Distributed has an interesting Instagram page where you can see the research and development that they do. Edit: They have since removed the Instagram page; here’s their Twitter though.

Here are some questions I don’t have the answers to yet but am curious about learning. Jump into the comments if you’re interested.
  • DD was not allowed to publish the Liberator files due to the fear that foreign nationals might be able to obtain them, making DD an illegal arms trafficker. What would happen if the files in question continued to be leaked onto torrent sites after being “hacked” by strangers DD had no control over? Would they still be responsible for the files being shared on the internet?

  • Should the GNU GPL be revised in order to remove liability for any harm caused by GPL licensed source files that are open sourced weapons?

I hope this was interesting for you, it certainly was interesting to write and learn about, so as always thank you for reading ~

How Should We Model Reality?

Does machine learning justify changing our need for a glass box?

Take a look at this paper. It’s titled “Deep learning and the Schrödinger equation”. Its implications are really interesting; first, let’s take a look at the paper’s claim.

“We have trained a deep (convolutional) neural network to predict the ground-state energy of an electron in four classes of confining two-dimensional electrostatic potentials. On randomly generated potentials, for which there is no analytic form for either the potential or the ground-state energy, the neural network model was able to predict the ground-state energy to within chemical accuracy, with a median absolute error of 1.49 mHa. We also investigate the performance of the model in predicting other quantities such as the kinetic energy and the first excited-state energy of random potentials. While we demonstrated this approach on a simple, tractable problem, the transferability and excellent performance of the resulting model suggests further applications of deep neural networks to problems of electronic structure.”

So what does this mean in layman’s terms? It basically means that we can give a computer a lot of different examples of any given problem we want it to solve, and it can then give us really accurate answers to questions about that problem.

Let’s say you were looking at some physical phenomenon you wanted to understand. (Let’s use something like the probability of finding a tennis ball that you dropped in a hole in the ground.)

Now in general if you drop a tennis ball in a big hole in the ground it usually falls towards the center where the hole is the deepest. So it’s unlikely that it will get caught on the outer fringes, but still possible.

So you might get some sort of curve of changing probability that looks like this, where x=0 is the center of the hole and the further out you go, the lower the probability gets.

ratchet probability example of how likely you are to find a tennis ball.
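If you want to generate a curve like this yourself, here’s a quick sketch (the Gaussian shape and its width are just illustrative assumptions on my part, not a real physical model):

```python
import numpy as np
import matplotlib.pyplot as plt

# Toy "tennis ball in a hole" distribution: most likely at the center
# (x = 0), falling off toward the edges. A Gaussian is a convenient stand-in.
x = np.linspace(-3, 3, 200)
p = np.exp(-x**2)
p /= np.trapz(p, x)  # normalize so the total probability is 1

plt.plot(x, p)
plt.xlabel("position x (center of the hole at x = 0)")
plt.ylabel("probability density")
plt.title("Where you're likely to find the tennis ball")
plt.show()
```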

Now imagine instead of a tennis ball, you’re looking at an electron, and instead of gravity acting on the tennis ball, it’s now electric potentials that create coulomb forces on that electron that force it into a certain position.

This situation I’ve just described is called the quantum well, something you’ll see a lot in an introductory quantum mechanics course. The usefulness of the quantum well example is that you usually learn how to model it using the **Schrödinger equation**. $$ i\hbar {\frac {\partial }{\partial t}}\Psi (\mathbf {r} ,t)=\left[{\frac {-\hbar ^{2}}{2\mu }}\nabla ^{2}+V(\mathbf {r} ,t)\right]\Psi (\mathbf {r} ,t) $$

You can read more about the Schrödinger equation here, but for our purposes what you should know is that it gets very complicated very quickly. The second you start trying to use it to model two tennis balls (two particles), things become very, very difficult to solve, especially because we only have the probabilities!

The glass box

When we manipulate equations that we believe accurately model the real world, every valid manipulation of that model is something that we also believe to be “true”, and we continue to manipulate this equation until we arrive at a statement of interest to the question we’re asking. These models are sometimes called a “glass box”. This is because the process we are inspecting is one that we believe that we understand because the model we’re using works for that particular problem.

In the context of this paper something called a Convolutional Neural Network is used. It works by looking at thousands of input-output examples of the thing you want to model, and creates an approximation of the raw numerical operations that are applied to your input in order to “fake” knowing whatever the function actually is. This process of creating a numerical model that doesn’t have to know anything about the function it’s approximating is called a “black box”. We know what we put into the box and we know what comes out, and that’s it. Because we don’t really know what the model looks like other than the numbers, we usually just test it to see how accurate it is and then use it for problems where we don’t have a glass box yet.
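To make the black box concrete, here’s a minimal sketch of that kind of model: a small convolutional network that maps a 2D potential sampled on a grid to a single predicted energy. This is a toy PyTorch stand-in with my own placeholder layer sizes and random data, not the paper’s actual architecture:

```python
import torch
import torch.nn as nn

# A tiny CNN that maps a 2D potential V(x, y), sampled on a 64x64 grid,
# to one number (the predicted ground-state energy). It's a pure black box:
# it never sees the Schrödinger equation, only (potential, energy) pairs.
class EnergyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Flatten(),
            nn.Linear(32 * 16 * 16, 1),
        )

    def forward(self, v):
        return self.net(v)

model = EnergyNet()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

# Placeholder training data: random potentials with made-up "true" energies.
# In the paper, the labels come from numerically solving the Schrödinger
# equation for each randomly generated potential.
potentials = torch.randn(8, 1, 64, 64)
energies = torch.randn(8, 1)

for step in range(100):
    optimizer.zero_grad()
    loss = loss_fn(model(potentials), energies)
    loss.backward()
    optimizer.step()
```

Once trained on enough real examples, you’d judge a model like this the way the paper does: purely by how accurate its predictions are on potentials it has never seen.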

This concept of neural networks was actually invented in the 1950s, but it’s only recently that this method of machine learning has exploded in popularity for solving complex problems. This paper is interesting because it enables us to use computers as black boxes to solve problems that were once thought to only be solvable with glass boxes. Now we don’t need to understand a problem in order to be able to use an approximation of the answer. And before you say it, this has nothing to do with the hidden variables theory, which isn’t what we’re talking about here. (That’s a debate that goes back to Einstein.)


Which models are better?

Well, just looking at it heuristically, we get exact accuracy with glass boxes, but they can take a long time to develop and research accurately.

On the other hand, we can get black boxes that give us a 0.1% margin of error and provide us with models for a lot of different problems a LOT faster, because all we have to do is collect some data and then train a machine learning algorithm.

That brings up a more important question for us then: what matters more? Is numerical accuracy all that matters? What is a model supposed to do for us?

I think the answer is that the model we choose in general entirely depends on the problem. Up until recently we’ve never really had the option to use anything other than a pure math, glass box solution to model any given situation to the best of our ability.

It is important to state, though, that this does have its limits. The predictions of these machine learning models are based solely upon our experimental data. So if there are unusual qualities we don’t know about, we won’t be able to notice them until we hit edge cases.

For example, the ultraviolet catastrophe was a fault in the Rayleigh–Jeans law that was only uncovered because we could test the model on much smaller wavelengths than we had used in experiments, to see what it would predict. Only then did we find that it was unrealistic and needed to be revised.

So how should we model reality? Do we need to understand how a particular phenomenon works in order to make meaningful use of our predictions? Or can we ignore that and go on to use machine learning to speed up research and engineering at the cost of understanding?

The engineer in me says who cares, but the physicist in me cares quite a lot and believes that there shouldn’t be a black box in our understanding of fundamental aspects of reality like Schrödinger evolution.

Sources