A simple but definitive guide to Mach-Zehnder interferometers
This semester I took the leap and ventured into the lab (and have yet to break anything). One of the things I have been working with is a Mach-Zehnder interferometer. Generally speaking, a Mach-Zehnder interferometer — or MZI — is a fairly simple and straightforward device. But there were a few oddities about it that were bugging me and they turned into a semester-long obsession. Attempting to find literature that fully explained what was going on turned out to be incredibly difficult and no one I ran into seemed to really know (or they all had differing opinions). But at long last, I think I have figured it out.
Regarding the notation that I will be using, the following image depicts a beam splitter in which the blue beam is transmitted and the red beam is reflected. The reflected beam on the side with the dot picks up a phase shift of π radians.
Technically there could be a phase shift at the mirrors as well (depending on how they are constructed), but since both arms pick up the same shift from these mirrors, we can safely ignore any mirror effects. So the general setup that I focused on was the following fairly standard form:
I’ve given the two arms of the interferometer different colors just to distinguish them. I made the output beams purple just to indicate that they are some mixture of light from the two arms.
Quantum mechanically, we can model this in a fairly simple manner. The input is a state vector describing a photon entering one input port, and each of the two beam splitters (both 50:50) is represented by a unitary matrix whose signs encode the π phase shift on the side with the dot. Multiplying the two matrices together gives the action of the whole device, and applying that product to the input gives the output state, which has all of its amplitude at Output 1. This means that, quantum mechanically, nothing should appear at Output 2. Thus, when we send single photons through the device, they always arrive at Output 1. (See Schumacher and Westmoreland, Chapter 2, for an excellent discussion of this.)
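For concreteness, here is one set of matrices that is consistent with the dot convention described above. The basis ordering (upper arm first, then lower arm; Output 1 first, then Output 2), the choice of input port, and the symbols $B_1$, $B_2$ are assumptions made for illustration, so the signs may be arranged differently from the original calculation. Any equivalent choice should give the same output probabilities.

```latex
% Input: a photon entering a single input port
|\text{in}\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix}

% First 50:50 splitter: the upper arm is the reflection off the dot side (factor -1)
B_1 = \frac{1}{\sqrt{2}} \begin{pmatrix} -1 & 1 \\ 1 & 1 \end{pmatrix}

% Second 50:50 splitter: the lower arm reflecting into Output 1 is on the dot side
B_2 = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}

% Whole device
B_2 B_1 = \frac{1}{2} \begin{pmatrix} -2 & 0 \\ 0 & 2 \end{pmatrix}
        = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}

% Output: all of the amplitude is at Output 1 (the overall sign is unobservable)
|\text{out}\rangle = B_2 B_1 |\text{in}\rangle = \begin{pmatrix} -1 \\ 0 \end{pmatrix}
```

With these assumed matrices the probability of detection is 1 at Output 1 and 0 at Output 2.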
If, however, we shine a bright laser through the MZI we actually see something like this (taken from my own setup — Output 2 is on the left and Output 1 is on the right):
I tossed in an extra mirror after Output 2 just so I could project the results onto the same screen. I also tacked on some lenses at the end just to blow up the pattern so you could see it. So, first of all, the obvious difference between this and the quantum case is that we now have photons reaching both outputs. This, of course, is inconsistent with the math we did above for the quantum case. The quantum result is not completely lost, however. If you look carefully, you will notice that the center of the interference pattern in Output 1 corresponds to a bright fringe whereas the center of the interference pattern in Output 2 corresponds to a dark fringe (note: it is wicked difficult to keep these things steady; the smallest movement, e.g. air conditioning, is enough to disturb the pattern, and that sensitivity is why MZIs are used in a number of practical situations). Also note that an interference pattern as shown above only appears if the MZI is set up in a perfect square (actually a rhombus, as we’ll see) and in the same plane. If it isn’t in a perfect square (and in the same plane), then you still see light at both outputs, but you don’t see an interference pattern.
So suppose we could very slowly crank up the laser intensity such that more and more photons began going through together. At what point would we start to see photons showing up at Output 2? More importantly, why do they start to show up there? Where does the interference pattern come from and why does it “preserve” some aspect of that quantum prediction? Numerous people have tossed out ideas here and there, but the only one that was even close to correct came from Nathan Wiebe, with whom I discussed this at the APS March Meeting. Nathan suggested that decoherence had something to do with it. Of course, this is related to something Neil Bates has been trying to disprove for a while now. I’m still not sure I understand his argument, so I can’t say for certain whether or not he is correct, but I can say that a certain type of decoherence definitely does have something to do with it. Credit Neil, however, with being the first person to alert me to the differences between spatial and temporal coherence in the beam (more on that later). Rather than give a detailed accounting of the different types of decoherence (both classical and quantum), I will instead simply explain what is happening and you can draw your own conclusions based on your understanding of the various types of decoherence.
So, first of all, if we model the beam as a continuous wave, the interesting thing is that by carefully keeping track of the phase shifts and combinations throughout the setup, we should get exactly the same result as in the single-photon case. For example, the upper arm picks up a phase shift of π radians at the first beam splitter. At the second beam splitter, a portion of each beam is transmitted and a portion is reflected. Looking at Output 1, we have a combination of the reflected lower beam, which picks up a phase shift here of π radians since it is on the side with the dot, and the transmitted upper beam, which already had a phase shift of π radians from the first beam splitter. So the phase shift on the reflected part of the lower beam has the effect of bringing the two beams back into phase with one another and we get perfectly constructive interference. Hence, we have light at Output 1. (Note that this implies that a single photon must travel through both arms simultaneously if we think of it as a wave packet!)
Looking at Output 2, however, the reflected portion of the upper beam, which combines with the transmitted portion of the lower beam, does not pick up a phase shift since it is not on the side with the dot! As such, the two beams are still out of phase by π radians and thus will destructively interfere, meaning we should not see any light at Output 2. So clearly the so-called “quantum” prediction is exactly the same as the so-called “classical” prediction, i.e. there’s only one prediction.
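The phase bookkeeping for the two outputs can be checked numerically by treating each beam as a complex amplitude. This is only a minimal sketch: it assumes ideal lossless 50:50 splitters, exactly equal arm lengths, and identical mirror phases in both arms, and the function name `classical_mzi_outputs` is made up for illustration.

```python
import cmath
import math

def classical_mzi_outputs():
    """Return (I1, I2): the fraction of the input intensity at Output 1 and Output 2.

    Assumes ideal 50:50 splitters and equal arm lengths (illustrative only).
    """
    split = 1 / math.sqrt(2)            # amplitude factor for each pass through a 50:50 splitter
    dot_shift = cmath.exp(1j * math.pi)  # pi phase shift for reflection on the dot side

    # First beam splitter: the upper arm is the reflection on the dot side.
    upper = split * dot_shift
    lower = split

    # Second beam splitter.
    # Output 1: transmitted upper beam + reflected lower beam (dot side, so pi shift).
    out1 = split * upper + split * dot_shift * lower
    # Output 2: reflected upper beam (no dot, no shift) + transmitted lower beam.
    out2 = split * upper + split * lower

    return abs(out1) ** 2, abs(out2) ** 2

i1, i2 = classical_mzi_outputs()
print(f"Output 1: {i1:.3f}, Output 2: {i2:.3f}")
```

Under these idealized assumptions all of the intensity lands at Output 1, matching the single-photon prediction.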
One possible explanation that I had set my sights on about a month ago had to do with the fact that the beam had a “width” to it which meant that not all parts of the beam were hitting the reflective portion of the beamsplitters in phase with one another. Notice, however, that regardless of where a particular part of the beam hits the reflecting part of the beamsplitter, it still forms a perfect square:
So while each part of the beam is out of phase with each other part, crucially they are never out of phase with themselves in such a way that the outputs flip. In other words, in every case you should still find light only at Output 1 (credit goes to our lab manager, Kathy Shartzer, for pointing that one out).
So then I figured that maybe it had something to do with the fact that the beam widens as it moves along (“beam spreading”), but if you perform the ray tracing as above, you will get a rhombus for the outer edges and if you keep track of the lengths and phases, it turns out you still should only get light at Output 1. Incidentally, this suggests that maybe it’s not that it has to be a perfect square, just a perfect rhombus. At any rate, it was at this point that I started to question the Law of Reflection (not to mention my sanity).
But then I started going back-and-forth between two books on optics: the classic one by Hecht and one on quantum optics by Fox (why did I not do this before?) and finally the light went on in my head (no pun intended). So here’s what’s happening.
First, I’ll address why there’s any light at Output 2 at all. When it finally occurred to me, it was a bit of a “well, duh” kind of moment. In order for the light to only appear at Output 1, the phases have to match up just as described above. But this means that the tolerances are very, very small! For example, suppose that we add an extra length to the upper arm that gives it an additional phase shift of π radians. This would have the effect of sending all of the light to Output 2. For the 532 nm light I was working with, this merely corresponds to adding 266 nm (half a wavelength) to the length of the upper arm. So it’s pretty obvious that any slight deviation from an absolutely perfect correspondence between the lengths of the two arms will change the results. Since the mirrors are not perfectly smooth and the beam has some width to it, it’s no surprise that matching the two arms this precisely is nearly impossible (certainly in my lab).
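To get a feel for how tight that tolerance is, here is a small sketch of how the light divides between the outputs as extra length is added to the upper arm. It assumes an idealized, perfectly monochromatic plane wave and ideal 50:50 splitters, in which case the fractions work out to cos² and sin² of half the added phase. The function name and the sample lengths are purely illustrative.

```python
import math

WAVELENGTH_NM = 532.0  # wavelength quoted in the text

def output_fractions(extra_length_nm, wavelength_nm=WAVELENGTH_NM):
    """Fractions of the input intensity at (Output 1, Output 2).

    extra_length_nm is the additional path length in the upper arm.
    The added phase is 2*pi*extra_length/wavelength; the fractions are
    cos^2 and sin^2 of half that phase (ideal monochromatic beam assumed).
    """
    half_phase = math.pi * extra_length_nm / wavelength_nm
    return math.cos(half_phase) ** 2, math.sin(half_phase) ** 2

# Illustrative extra lengths, in nanometres.
for extra in (0.0, 10.0, 133.0, 266.0):
    i1, i2 = output_fractions(extra)
    print(f"{extra:6.1f} nm -> Output 1: {i1:.3f}, Output 2: {i2:.3f}")
```

An extra 266 nm swaps the outputs completely, and 133 nm (a quarter wavelength) already splits the light evenly between them.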
But that only explains the presence of light at both outputs. Why is there an interference pattern, why does it only occur when we are very close to a perfect rhombus, and why does it somehow preserve the expected result in the center fringe of the pattern? The answer to that has to do with temporal decoherence. This is quantified by the coherence time $\tau_c$, which is the time duration over which the phase remains stable. Coherence time is related to the spread of angular frequencies $\Delta\omega$ in the beam by $\tau_c \approx 1/\Delta\omega$.
In other words, only a perfectly monochromatic beam is fully coherent, i.e. has an infinite coherence time. All realistic beams are only partially coherent because there is always some spread to the angular frequencies (and thus wavelength), i.e. they’re not truly monochromatic. To quote from Fox,
If we know the phase of the wave at some position $z$ at time $t_1$, then the phase at the same position but at a different time $t_2$ will be known with a high degree of certainty when $|t_2 - t_1| \ll \tau_c$, and with a very low degree when $|t_2 - t_1| \gg \tau_c$.
A more convenient measure is the coherence length, $L_c = c\,\tau_c$, where $c$ is the speed of light. So another way to state the above is to say that if we know the phase of the wave at $z_1$, then the phase at the same time at $z_2$ will only be known to a high degree of certainty if $|z_2 - z_1| \ll L_c$. That means that in order to get the two arms to have just the right phase to produce an interference pattern, the difference in length between the two arms, $\Delta L$, has to satisfy $|\Delta L| \ll L_c$. This explains why we need nearly a perfect square (or rhombus) to get an interference pattern, and it makes it clear that any such pattern is related to the natural variability in the beam. Anything else will simply produce light at both outputs. The only way to get the actual predicted result of light only appearing at Output 1 is either to dial it down to single photons (since I don’t think that a single photon has a coherence time associated with it, but I could be wrong) or to have a perfectly monochromatic beam. (Note that a more accurate description involves the first-order correlation function, which includes an oscillating term that explains this rapid changing of the angular frequency.) Note that this relates to the interpretation of the single photon taking both paths simultaneously (see Fox, p. 302).
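As a rough worked example, the order-of-magnitude relations τ_c ≈ 1/Δω and L_c = c·τ_c turn a linewidth into a coherence length. The linewidths below are hypothetical round numbers chosen for illustration, not measurements of my laser, and the function names are made up for this sketch.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # metres per second

def coherence_time(delta_omega):
    """Order-of-magnitude coherence time (s) for an angular-frequency spread (rad/s)."""
    return 1.0 / delta_omega

def coherence_length(delta_omega):
    """Coherence length (m): the speed of light times the coherence time."""
    return SPEED_OF_LIGHT * coherence_time(delta_omega)

# Hypothetical linewidths (in Hz), for illustration only.
for label, delta_nu_hz in (("1 GHz", 1e9), ("1 MHz", 1e6)):
    delta_omega = 2 * math.pi * delta_nu_hz  # convert frequency spread to angular frequency
    print(f"linewidth {label}: L_c is roughly {coherence_length(delta_omega):.3g} m")
```

Since these are only order-of-magnitude estimates, the point is the scaling: a narrower linewidth gives a proportionally longer coherence length, and so a looser tolerance on how closely the two arm lengths have to match.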
The question then becomes, why does the center of each output faithfully retain the information of the expected result and why, if we adjust the mirror angles, does the spacing between the fringes change? Actually, the center of the outputs will only retain the expected result if the setup is exactly a perfect square (or rhombus), or if the two arm lengths differ by an amount whose phase is a whole multiple of 2π, as discussed above. This explains why sometimes I got the opposite of my expected result. It also explains why the pattern seemed to be constantly shifting (and did so especially when there were vibrations in the air or on the optical bench). The alternating pattern then results from the fact that the mirrors are likely not exactly at 45-degree angles (remember how insanely small the tolerances are). So, for example, if we had mirrors that were exactly at 45-degree angles, what we would likely see would be light flashing back and forth between the two outputs, but no interference fringes.
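The dependence of fringe spacing on mirror angle can be sketched with the standard two-beam result: two plane waves crossing at a small angle θ produce fringes spaced λ / (2 sin(θ/2)) apart. This is a simplified model that assumes the two recombined beams are plane waves and ignores the magnifying lenses. The function name and the sample angles are illustrative, not measurements.

```python
import math

def fringe_spacing(wavelength_m, crossing_angle_rad):
    """Spacing (m) between bright fringes for two plane waves crossing at the given angle.

    Perfectly parallel beams (angle of zero) give an infinite spacing, i.e. no
    fringes across the beam, only a uniform brightness.
    """
    if crossing_angle_rad == 0:
        return math.inf
    return wavelength_m / (2 * math.sin(abs(crossing_angle_rad) / 2))

WAVELENGTH_M = 532e-9  # wavelength quoted in the text

# Illustrative crossing angles between the two recombined beams, in radians.
for angle in (0.0, 1e-4, 1e-3):
    spacing = fringe_spacing(WAVELENGTH_M, angle)
    print(f"angle {angle:.0e} rad -> fringe spacing {spacing:.3g} m")
```

In this model a smaller misalignment spreads the fringes further apart, and in the limit of perfectly parallel beams the whole output brightens and dims together, which is the "flashing back and forth" behavior described above.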
So the only open question that I see is: if we start with single photons and slowly crank up the intensity, at what point does the coherence time come into play, i.e. at what point does temporal decoherence kick in? I suspect the answer lies in photon bunching, but I’ll have to do some more reading and thinking and, eventually, experimenting…