A simple but definitive guide to Mach-Zehnder interferometers

This semester I took the leap and ventured into the lab (and have yet to break anything). One of the things I have been working with is a Mach-Zehnder interferometer. Generally speaking, a Mach-Zehnder interferometer — or MZI — is a fairly simple and straightforward device. But there were a few oddities about it that were bugging me and they turned into a semester-long obsession. Attempting to find literature that fully explained what was going on turned out to be incredibly difficult and no one I ran into seemed to really know (or they all had differing opinions). But at long last, I think I have figured it out.

Regarding the notation that I will be using, the following image depicts a beam splitter in which the blue beam is transmitted and the red beam is reflected. The reflected beam on the side with the dot picks up a phase shift of π radians.


Technically there could be a phase shift at the mirrors as well (depending on how they are constructed), but since both arms pick up the same shift from these mirrors, we can safely ignore any mirror effects. So the general setup that I focused on was the following fairly standard form:


I’ve given the two arms of the interferometer different colors just to distinguish them. I made the output beams purple just to indicate that they are some mixture of light from the two arms.

Quantum mechanically, we can model this in a fairly simple manner if we consider the input to be |0\rangle. The first beam splitter (which is 50:50) is given by \frac{1}{\sqrt{2}}\left(\begin{array}{cc}-1 & 1 \\ 1 & 1\end{array}\right) while the second beam splitter (which is also 50:50) is given by \frac{1}{\sqrt{2}}\left(\begin{array}{cc}1 & 1 \\ 1 & -1\end{array}\right). Together they are given, up to an overall sign that depends on the order of multiplication and has no physical effect, by \left(\begin{array}{cc}0 & -1\\1 & 0\end{array}\right). As such, the output will be |1\rangle. This means that, quantum mechanically, nothing should appear at Output 2. Thus, when we send single photons through the device, they always arrive at Output 1. (See Schumacher and Westmoreland, Chapter 2, for an excellent discussion of this.)
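As a quick sanity check of the matrix arithmetic, here is a minimal NumPy sketch. It uses the two beam splitter matrices exactly as written above; the variable names are mine, and I am assuming that the first beam splitter acts first (so it sits rightmost in the product). With that ordering the product comes out as the negative of the matrix quoted in the text, which changes nothing about the detection probabilities.

```python
import numpy as np

# Beam splitter matrices as written in the text (both 50:50).
BS1 = np.array([[-1, 1], [1, 1]]) / np.sqrt(2)
BS2 = np.array([[1, 1], [1, -1]]) / np.sqrt(2)

ket0 = np.array([1.0, 0.0])  # input state |0>

# Assumption: the first beam splitter acts first, so it is rightmost.
mzi = BS2 @ BS1
out = mzi @ ket0

# Detection probabilities for the |0> and |1> components of the output.
probs = np.abs(out) ** 2

print(np.round(mzi, 12))
print(np.round(probs, 12))
```

Multiplying in the other order, `BS1 @ BS2`, reproduces the matrix quoted in the text; either way all of the probability ends up in the |1\rangle component.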

If, however, we shine a bright laser through the MZI we actually see something like this (taken from my own setup — Output 2 is on the left and Output 1 is on the right):


I tossed in an extra mirror after Output 2 just so I could project the results onto the same screen. I also tacked on some lenses at the end just to blow up the pattern so you could see it. So, first of all, the obvious difference between this and the quantum case is that we now have photons reaching both outputs. This, of course, is inconsistent with the math we did above for the quantum case. The quantum result is not completely lost, however. If you look carefully, you will notice that the center of the interference pattern in Output 1 corresponds to a bright fringe whereas the center of the interference pattern in Output 2 corresponds to a dark fringe (note: it is wicked difficult to keep these things steady; the smallest movement, e.g. air conditioning, is enough to disturb it, and that sensitivity is why MZIs are used in a number of practical situations). Also note that an interference pattern as shown above only appears if the MZI is set up in a perfect square (actually a rhombus, as we’ll see) and in the same plane. If it isn’t in a perfect square (and in the same plane), then you still see light at both outputs, but you don’t see an interference pattern.

So suppose we could very slowly crank up the laser intensity such that more and more photons began going through together. At what point would we start to see photons showing up at Output 2? More importantly, why do they start to show up there? Where does the interference pattern come from and why does it “preserve” some aspect of that quantum prediction? Numerous people have tossed out ideas here and there but the only one that was even close to correct was Nathan Wiebe with whom I discussed this at the APS March Meeting. Nathan suggested that decoherence had something to do with it. Of course, this is related to something Neil Bates has been trying to disprove for awhile now. I’m still not sure if I understand his argument so I can’t say for certain whether or not he is correct, but I can say that a certain type of decoherence definitely does have something to do with it. Credit Neil, however, with being the first person to alert me to the differences between spatial and temporal coherence in the beam (more on that later). Rather than give a detailed accounting of the different types of decoherence (both classical and quantum), I will instead simply explain what is happening and you can draw your own conclusions based on your understanding of the various types of decoherence.

So, first of all, if we model the beam as a continuous wave, the interesting thing is that by carefully keeping track of the phase shifts and combinations throughout the setup, we should get the same exact result as in the single-photon case. For example, the upper arm picks up a phase shift of π radians at the first beam splitter. At the second beam splitter, a portion of each beam is transmitted and a portion is reflected. Looking at Output 1, we have a combination of the reflected lower beam, which picks up a phase shift here of π radians since it is on the side with the dot, and the transmitted upper beam which already had a phase shift of π radians from the initial beam. So the phase shift on the reflected part of the lower beam has the effect of bringing the two beams back into phase with one another and we get perfectly constructive interference. Hence, we have light at Output 1. (Note that this implies that a single photon must travel through both arms simultaneously if we think of it as a wave packet!)

Looking at Output 2, however, the reflected portion of the upper beam, which combines with the transmitted portion of the lower beam, does not pick up a phase shift since it is not on the side with the dot! As such, the two beams are still out of phase by π radians and thus will destructively interfere, meaning we should not see any light at Output 2. So clearly the so-called “quantum” prediction is exactly the same as the so-called “classical” prediction, i.e. there’s only one prediction.
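The phase bookkeeping in the last two paragraphs can be written out as complex amplitudes. This is only a sketch under the idealizations used in the text: a monochromatic wave, perfect 50:50 splitters, equal arm lengths, and a π shift only on reflections from the dotted side. The variable names are my own.

```python
import cmath
import math

r = t = 1 / math.sqrt(2)              # amplitude reflection / transmission (50:50)
PI_SHIFT = cmath.exp(1j * math.pi)    # phase factor for a pi phase shift

# After the first beam splitter:
upper = r * PI_SHIFT                  # upper arm picks up pi at the first splitter
lower = t                             # lower arm is transmitted, no shift

# At the second beam splitter:
# Output 1 = transmitted upper + reflected lower (dot side, so another pi).
out1 = t * upper + r * PI_SHIFT * lower
# Output 2 = reflected upper (no dot, no shift) + transmitted lower.
out2 = r * upper + t * lower

I1, I2 = abs(out1) ** 2, abs(out2) ** 2
print(round(I1, 12), round(I2, 12))
```

The two contributions to Output 1 carry the same phase and add, while the two contributions to Output 2 differ by π and cancel.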

One possible explanation that I had set my sights on about a month ago had to do with the fact that the beam had a “width” to it which meant that not all parts of the beam were hitting the reflective portion of the beamsplitters in phase with one another. Notice, however, that regardless of where a particular part of the beam hits the reflecting part of the beamsplitter, it still forms a perfect square:


So while each part of the beam is out of phase with each other part, crucially they are never out of phase with themselves in such a way that the outputs flip. In other words, in every case you should still find light only at Output 1 (credit goes to our lab manager, Kathy Shartzer, for pointing that one out).

So then I figured that maybe it had something to do with the fact that the beam widens as it moves along (“beam spreading”), but if you perform the ray tracing as above, you will get a rhombus for the outer edges and if you keep track of the lengths and phases, it turns out you still should only get light at Output 1. Incidentally, this suggests that maybe it’s not that it has to be a perfect square, just a perfect rhombus. At any rate, it was at this point that I started to question the Law of Reflection (not to mention my sanity).

But then I started going back-and-forth between two books on optics: the classic one by Hecht and one on quantum optics by Fox (why did I not do this before?) and finally the light went on in my head (no pun intended). So here’s what’s happening.

First, I’ll address why there’s any light at Output 2 at all. When it finally occurred to me, it was a bit of a “well, duh” kind of moment. In order for the light to only appear at Output 1, the phases have to match up just as described above. But this means that the tolerances are very, very small! For example, suppose that we add an extra length to the upper arm that gives it an additional phase shift of π radians. This would have the effect of sending all of the light to Output 2. For the 532 nm light I was working with, this merely corresponds to adding 266 nm (half a wavelength) to the length of the upper arm. So it’s pretty obvious that any slight deviation from an absolutely perfect correspondence between the lengths of the two arms will change the results. Since the mirrors are not perfectly smooth and the beam has some width to it, it’s no surprise that achieving this correspondence is nearly impossible (certainly in my lab).
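To get a feel for how tight that tolerance is, here is a small sketch of the idealized two-beam result: an extra length ΔL in one arm adds a phase 2πΔL/λ, and the light splits between the outputs as cos² and sin² of half that phase. The function name and the sample lengths are illustrative choices of mine; only the 532 nm wavelength comes from the text.

```python
import numpy as np

WAVELENGTH_NM = 532.0  # wavelength quoted in the text

def output_fractions(extra_length_nm, wavelength_nm=WAVELENGTH_NM):
    """Fractions of light at (Output 1, Output 2) for an ideal, monochromatic
    MZI with `extra_length_nm` of additional path in one arm."""
    phase = 2 * np.pi * extra_length_nm / wavelength_nm
    return np.cos(phase / 2) ** 2, np.sin(phase / 2) ** 2

# Hypothetical sample mismatches, chosen only for illustration.
for dl in (0.0, 26.6, 133.0, 266.0):
    i1, i2 = output_fractions(dl)
    print(f"{dl:6.1f} nm -> Output 1: {i1:.3f}, Output 2: {i2:.3f}")
```

In this model a mismatch of 133 nm already splits the light evenly, and 266 nm moves all of it to Output 2, which is the half-wavelength case described above.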

But that only explains the presence of light at both outputs. Why is there an interference pattern, why does it only occur when we are very close to a perfect rhombus, and why does it somehow preserve the expected result in the center fringe of the pattern? The answer to that has to do with temporal decoherence. This is quantified by the coherence time \tau_{c} which is the time duration over which the phase remains stable. Coherence time is related to the spread of angular frequencies \Delta \omega in the beam by

\tau_{c}\approx\frac{1}{\Delta \omega}.

In other words, only a perfectly monochromatic beam is fully coherent, i.e. has an infinite coherence time. All realistic beams are only partially coherent because there is always some spread to the angular frequencies (and thus wavelength), i.e. they’re not truly monochromatic. To quote from Fox,

If we know the phase of the wave at some position z at time t_{1}, then the phase at the same position but at a different time t_{2} will be known with a high degree of certainty when |t_{2}-t_{1}|\ll\tau_{c}, and with a very low degree when |t_{2}-t_{1}|\gg\tau_{c}

A more convenient measure is the coherence length, L_{c}=c\tau_{c} where c is the speed of light. So another way to state the above is to say that if we know the phase of the wave at z_{1}, then the phase at the same time at z_{2} will only be known to a high degree of certainty if |z_{2}-z_{1}|\ll L_{c}. That means that in order to get the two arms to have just the right phase to produce an interference pattern, the difference in length between the two arms has to satisfy 2\Delta L\lesssim L_{c}. This explains why we need nearly a perfect square (or rhombus) to get an interference pattern and it makes it clear that any such pattern is related to the natural variability in the beam. Anything else will simply produce light at both outputs. The only way to get the actual predicted result of light only appearing at Output 1 is to either dial it down to single photons (since I don’t think that a single photon has a coherence time associated with it, but I could be wrong) or to have a perfectly monochromatic beam. (Note that a more accurate description involves the first-order correlation function, which includes an oscillating term that explains this rapid changing of the angular frequency.) Note that this relates to the interpretation of the single photon taking both paths simultaneously (see Fox, p. 302).
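For a rough number, the two relations above can be combined with Δω ≈ 2πcΔλ/λ², which holds for a narrow line. The sketch below does that; the linewidths are assumptions rather than measured values (2 nm is the figure I recall for my laser in the comments below, and the smaller values are the ones a commenter suggests). Other conventions for the coherence length differ from this one by factors of order 2π.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def coherence_length(wavelength_m, linewidth_m):
    """L_c = c * tau_c with tau_c ~ 1/delta_omega, where
    delta_omega ~ 2*pi*c*delta_lambda / lambda**2 (narrow-line approximation)."""
    delta_omega = 2 * math.pi * C * linewidth_m / wavelength_m ** 2
    tau_c = 1 / delta_omega
    return C * tau_c

# Assumed linewidths in nm; none of these is a measured value.
for dlam_nm in (2.0, 1.0, 0.1):
    lc = coherence_length(532e-9, dlam_nm * 1e-9)
    print(f"linewidth {dlam_nm:4.1f} nm -> L_c ~ {lc * 1e6:7.1f} microns")
```

With a 2 nm linewidth this gives roughly 22.5 microns, and the result scales inversely with the linewidth, so a narrower line quickly pushes the coherence length into the hundreds of microns.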

The question then becomes, why does the center of each output faithfully retain the information of the expected result and why, if we adjust the mirror angles, does the spacing between the fringes change? Actually, the center of the outputs will only retain the expected result if it is exactly a perfect square or some proper multiple of the phase as discussed above. This explains why sometimes I got the opposite of my expected result. It also explains why the pattern seemed to constantly be shifting (and did so especially when there were vibrations in the air or on the optical bench). The alternating pattern then results from the fact that the mirrors are likely not exactly at 45 degree angles (remember how insanely small the tolerances are). So, for example, if we had mirrors that were exactly at 45 degree angles, what we would likely see would be light flashing back and forth between the two outputs, but no interference fringes.
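One way to picture the dependence of fringe spacing on mirror angle is to treat the two beams leaving the second beam splitter as plane waves crossing at a small angle θ. That is an idealization of my own choosing, not something derived above; under it the fringe period is λ / (2 sin(θ/2)). The function name and sample angles are hypothetical.

```python
import math

def fringe_spacing(wavelength_m, angle_rad):
    """Fringe period for two plane waves crossing at `angle_rad`.
    Perfectly parallel beams give no fringes (infinite period)."""
    if angle_rad == 0:
        return math.inf
    return wavelength_m / (2 * math.sin(abs(angle_rad) / 2))

# Hypothetical angles between the two output beams, in milliradians.
for angle_mrad in (0.0, 0.1, 1.0, 10.0):
    spacing = fringe_spacing(532e-9, angle_mrad * 1e-3)
    print(f"{angle_mrad:5.1f} mrad -> fringe spacing {spacing * 1e3:.4g} mm")
```

In this picture a perfectly aligned pair of beams has an infinite fringe period, so the whole output brightens and dims together, while even a milliradian of relative tilt produces fringes about half a millimeter apart at 532 nm.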

So the only open question that I see is: if we start with single photons and slowly crank up the intensity, at what point does the coherence time come into play, i.e. at what point does temporal decoherence kick in? I suspect the answer lies in photon bunching, but I’ll have to do some more reading and thinking and, eventually, experimenting…

32 thoughts on “A simple but definitive guide to Mach-Zehnder interferometers”


  1. I don’t think it’s the coherence length, since these are typically much larger than the tolerances that you describe (especially for a CW laser). The photo at the start of your article reminds me of Hermite-Gaussian beam profiles, so perhaps your laser is emitting in different modes. Not sure how that would explain your findings, though. You can send your laser through a small pinhole and use a lens to get a paraxial beam again. That way you select the lowest mode number (at the expense of intensity).

    As for the single photon coherence time: since all you’re doing is dialing down the intensity of the laser, the single photons* will have exactly the same mode structure as the original laser beam. In particular, the photons are multi-frequency wave packets with finite coherence lengths. The answer to your final question is that (assuming the setup is stable), if you send in photons one-by-one they slowly build up the pattern that you see in the picture. Sending individual photons will not make output 2 completely dark.

    *Of course, a laser does not emit “single photons” at any intensity, no matter how low. The quantum state is a superposition of 0, 1, 2,…etc., photons with increasingly lower likelihood of 1, 2, etc. I don’t think that is important here.

    1. Hi Pieter,

      Thanks for the reply. If I calculated it correctly, the coherence length of the laser I’m using should be very roughly 22.5 microns. Here’s the laser I’m using: http://www.thorlabs.com/newgrouppage9.cfm?objectgroup_id=5597.

      Regarding sending in photons one at a time, if they don’t make output 2 completely dark, how does that jibe with the mathematical prediction, then?

      I’ll have to try the pinhole idea, but my gut tells me that’s not it because the interference pattern is clearly related in some way to the optics that are external to the laser itself.

      1. So, are you talking about the longitudinal coherence length of the laser or the transverse coherence length? Also, note that it is about 100 times larger than λ/2, so I still think it plays no role.

        Another effect that may be relevant is the inevitable slight divergence of the beam. I would not have thought it relevant here, but who knows. If you realign your MZI as a Michelson-Morley interferometer, will you have the same problem?

      2. The quantum description you give does not include all the gory details of the laser modes, the mirrors and the beam splitters. So it is not surprising that it does not show up there. It is a highly idealized description, and you have found that your setup is not ideal.

        A single photon in a MZI has all the characteristics of the laser beam. The only difference is that you can’t measure it all at once. You get only a single dot on the screen, instead of the full pattern that the laser produces. When you correctly calculate the probability of where the photon will hit the screen, it will be distributed like your picture.

      3. Regarding the coherence length, I’m actually talking about the temporal coherence, not the spatial coherence.

        As for the “quantum” description, it actually matches the idealized classical description. But either way, you are correct that it is idealized. What’s interesting is that the center of the interference patterns always corresponds to this idealized description.

  2. Ian, thanks for the writeup and the mention. I’ll get back with more later. Meanwhile, shoot me an email with any questions you have about my decoherence proposal. Cheers.

    PS: Pieter, it is possible to make single-photon emitters but I think you’re right about lasers: filtering just reduces the amplitude of the combined multi-photon state mixture.

    1. Hi Neil,

      Once I fully understand what’s going on here, I will then try to understand what your proposal is all about. I’m pretty sure that I wasn’t setting it up correctly at first (in the sense that I think I need to shorten the lengths of the various arms a bit to keep the beam from spreading too much, unless I try Pieter’s proposal with the pinhole).

  3. I concur with Pieter that you have to take into account spatial mode structure when analysing a MZI. You could get round this by using single-mode fibres. As it is, I think you haven’t quite aligned all the modes to interfere on the final beamsplitter.

    Temporal coherence is not an issue if the path lengths are equal. This can be demonstrated by setting up a Michelson-Morley interferometer (MMI) with equal path lengths and illuminated using a white light source. You can still see fringes, though they die out very quickly once you go just a little away from equal path length. I had to demonstrate this to undergrads so know just how fiddly it can be to set up an interferometer. Looking at the output of a MMI also demonstrates the importance of transverse beam mode structure. You see either rings or curved lines (more or less) due to the multimode structure of the input “beam” (actually a diffused light source) and the slightly different path lengths each takes. You actually use the curvature of the fringe pattern to help align the MMI and find the equal path length setting.

    To see that in a perfectly aligned, single mode MZI you still get interference regardless of the light source (e.g. single-photon, n-photon, chaotic, thermal, coherent etc.), you just need to see how the creation operators transform under the action of the beamsplitters and phase shifter. A creation operator corresponding to input 1 will end up being transformed into a creation operator corresponding to output mode 1 if the phases are correct.

    I have in mind a simple (or so I thought) experiment that uses an MZI as its main element but speaking to an experimental quantum optician soon disabused me of actually going into the lab to do it myself. Main issues were achieving high visibilities (I’d like 99.99%) and locking the MZI to a fringe (thermal drift, vibration etc.) whilst making measurements on the outputs. My simple setup soon sprouted piezo mirror mounts, lock-in amps and feedback servos, all in addition to the actual output analyser. I’m trying to outsource the experiment to someone who has much more experience than I :-).

  4. My student set up a Michelson interferometer this afternoon and two separate interference patterns were clearly visible. One was very clearly due to the internal modes of the laser since it was pretty much always visible (though it sort of “popped” when things were lined up just right). Specifically, it looked like the TEM20 mode.

    But when the lengths were about equal we got the same kind of pattern I was seeing in the MZI and you could still see the other pattern “underneath.” The thing is that the fringe pattern we’re seeing in both the MZI and the Michelson doesn’t resemble any cavity mode or even any combination of cavity modes I’ve ever seen.

    According to every source I’ve read, temporal coherence is precisely the source of the fringe pattern in multi-beam interference experiments. It’s supposedly due to the oscillatory part of the first-order correlation function.

    Daniel: regarding your experiment, what exactly were you trying to do? Vibration’s the tough one. You can minimize the thermal issues by doing it in a vacuum chamber. But, depending on the accuracy you’re aiming for, you might need not just dampers on the optical table but probably a ground floor that’s acoustically isolated from the rest of the building.

    1. I think the “solution” to the fringe locking issue was to use an oscillating drive on a piezo mirror mount to scan through the bottom of a fringe, track one of the outputs with a photodiode and then gate the measurements on the other port to a min/max of the fringe. For the first set of experiments, we could do it all with a single strong beam, but if we had to go to the single photon limit, I suppose one could use a second laser of a different frequency for path length stabilisation (like they do in some cavity experiments).

      What happens to your outputs when you remove the second beam splitter? Do you get two gaussian spots (one from each output) or a more complicated mode structure? Even if you have nice gaussian beams, if you overlap two gaussians with a slight transverse displacement and linearly varying relative phase profile (due to angular misalignment), I suspect you may get a similar sort of interference pattern.

  5. I don’t know why I didn’t mention this before, but another reason I don’t think it has anything to do with the modes is that fine adjustments of the mirrors change the spacing between the fringes in a continuous manner. If it is mode-related, fine adjustments of the mirrors should not result in continuous changes in the fringe spacing.

    Removing the second beam splitter preserves what I am now positive is the cavity mode structure of the laser itself, but destroys the pattern I have been describing. The more I think about it, the more I’m convinced it has to be how I described it (besides, every reference I have read would be wrong…).

    Regarding the fringe locking, I assume that you’re talking about trying to home in on one of the mode fringes.

  6. I think it’s to do with mode mismatch at the beamsplitter. On G+ I’ve shared some Mathematica plots of the result of two gaussian modes interfering at a beamsplitter but with transverse linear phase difference. Actually, I don’t think you need the two modes to be displaced, just that they are not perfectly aligned so the relative phase across their transverse mode profile is not constant but linearly varying.

    This could be the result of the angles at which the modes intercept the beamsplitter not being equal. Adjusting the mirrors will change the tilt of the beams. If you align the mirrors so that the beams meet at one end of the beamsplitter, compared to the other, the relative phase across the modes will change, hence alter the fringe spacing.

    You should be able to work out the phase slope from the geometry, and the width of the beams and compare it with the fringe spacing you observe.

    1. I saw your plots on G+. As I noted there, I think we’re talking about something similar now. I think maybe what we’re getting hung up on is the word “mode.” Are you talking about the internal cavity modes of the laser itself or the modes in the “cavity” that is the MZI?

      1. I’m talking about the latter. For the purposes of analysis, one can just consider the electromagnetic field of two monochromatic beams in the paraxial approximation meeting at the final beamsplitter. One gets very similar output patterns by using gaussian beams but with slightly different incidence angles at the beamsplitter.

      2. I’m not going to be in my office until late next week sometime, but I’ll have to play around with it and see what happens. I’m still not convinced.

  7. Reminder, again – the essence of Chad Orzel’s thought experiment, and my variation of it, was to test the effects of varying of global phase for the MZI legs overall. IOW, for the phase between top and bottom leg to vary by one angle in “shot 1”, a different angle during “shot 2”, yet another angle for “shot 3” etc, randomly (each “shot” means one photon!) This variation in general phase angle from instance to instance is not to be confused with any variation in phase laterally across the bore (if I may borrow a term) of either MZI channel passage.

    And, Ian, what in particular are you “not convinced” of? I can’t really tell. tx.

    1. Hmm. I think I am not understanding your experiment, then. But the good news is that I just ordered a pack of photon detectors and they should arrive this summer sometime (I hope). Of course, that still leaves the task of truly creating single photons…

  8. Perhaps I should clarify this. I mean, in Chad’s and my expanded version, the phase difference between paths is randomly changed between each shot of a photon through the MZI. So for example, maybe the first shot has no alteration of relative phase: the same as your setup. (Hence the first photon has a 100% chance of coming out of Channel 1, what you call Output 1. BTW you have a typo above: you wrote “quantum mechanically, nothing should appear at Output 1. Thus, when we send single photons through the device, they always arrive at Output 1.” Surely you meant, “nothing should appear at Output 2.”) But then our “confuser” will start inserting phase changes. So maybe the second photon encounters a phase shift (additional to any other) of 0.3 rad in the lower leg. That means that the superposition coming out of BS2 is different than before. The amplitudes no longer add to 100% from Output 1 and 0% from Output 2. Then for photon 3, maybe we have a shift of -0.2 rad, which makes for yet another set of net amplitude outputs from the Channels.

    After this goes on for awhile, the average effect is to wash out the ensemble average interference in BS2. The output will be 50/50 from Channels 1 and 2. The decoherence advocates say, this is like the situation being classical and sort of “explains” the particle-like behavior. Putting aside other criticisms, my proposal showed how to recover the evidence that amplitudes continue to come out of both Channels, and requires an asymmetrical BS1 (ie, unequal split.) BTW the best link about that proposal is at http://fqxi.org/community/forum/topic/949.

  9. An experimentalist here who’s built these a few times.

    I’d suggest that the fact that the fringe spacing changes with angle is a strong hint that your MZI is not perfectly aligned – the k-vectors from each arm must be exactly aligned and the beams be exactly centred. In other words, the two arms are not in the same mode after the 2nd beam splitter and so you will have imperfect alignment. As you change the angle you are effectively changing the relative phase of one arm that is projected along the beam from the other arm.

    Also, as a sanity check. Your coherence length – relating the temporal coherence to a distance using c – is almost certainly longer than 22um. At 532nm a laser linewidth of 1nm corresponds to a coherence length of 300um. The width of those lasers is typically less than 0.1nm. You could check this by building a Michelson and varying the path length difference, just as Daniel suggested.

    It is not at all critical that the interferometer is square. Ideally both arms are the same length, which works with any parallelogram. The hard part of this is in mode-matching the individual arms into the output. Perfect mode-matching of interferometers is actually very difficult!

    1. Awesome, an experimentalist! Thanks for the comment! Regarding my coherence length calculation, the linewidth was, I thought, 2 nm. I’ll have to run my calculation again.

      At any rate, as I’m about to put up in another post, over the weekend I was messing around with it and figured out that I was most definitely seeing the cavity modes. It was odd, though, since it was so different from what I was seeing before.

      I have come to the determination that books on optics, notably MZIs, are full of crap.

  10. Can anyone explain to me how the fringe pattern is formed in the MZ interferometer? Because I really can’t find any path length difference created when the MZI is set up. And in the book by Hecht, I found it’s due to the tilt of the mirror. I really don’t understand how the path length difference is created. Is it that the fringes are formed exactly how they are formed in Michelson’s interferometer? Can anyone help me find this answer please?

    1. That’s actually what I’m trying to figure out myself. I don’t think Hecht is quite right on this. I set up a Michelson a few weeks ago and the pattern was very clearly a result of the internal mode of the laser. I was getting similar results with the MZI, but then had some weird stuff happen (as documented above). There’s a good book on Quantum Optics by a guy named Mark Fox from Oxford U. Press that discusses some of this. I’ll be away from my office and lab for awhile so I won’t be fiddling with this, but I’m hoping to get back into playing around with it later in the summer at which point I’d like to try to settle this question. So stay tuned!

  11. From a wave-model point of view (not photons), the axis of the beam that goes through the final beamsplitter and the axis of the beam that is reflected by the final beamsplitter are not exactly collinear; they will not emerge from the same point of the last beamsplitter. The minute angle between the axes of these beams makes an interference pattern because the wavefronts in one beam will not be quite parallel to the wavefronts in the other beam. The pattern from a Mach-Zehnder interferometer exists for the same reason as the pattern from a Michelson interferometer. The only difference is the math required to identify where the fringes will be. The fact that the phase is not constant along a plane cross section through either beam only has a small effect on the exact location of the fringes; the relative inclination of the two beams plays a larger role regarding where the fringes will be found.

    1. Right, but why does the center of the pattern exactly match the quantum result? In other words, what I’m curious about is the relationship between the classical and quantum results. And I honestly don’t know the answer to this so if you do, I’d love to hear about it.

  12. I guess I should add that if you eliminate the inclination between the two emerging sets of beams (so the wave-normals coincide along the axis of the beams), then the phase variation across the cross section of the beam (the deviation of the wavefronts from being ideally planar) WILL dominate the interference mechanism, as it does for the “zeroth order” bull’s eye type pattern. (The inclination of the beams is the chief mechanism for high order parallel line fringes.)

    1. Well, again, it depends on where you live. You can buy a pre-made MZI, though not for $1000. Honestly, it would be pretty difficult to make a good MZI for $1000.

    1. Thanks for the links. That’s pretty impressive! We have a pretty nice setup, but it’s not made out of Legos and Legos are much cooler. 😀
