Predicting relationship success using Facebook data
Facebook recently released an analysis of relationships on their site. One graphic was particularly interesting to me. Of the relationships that have lasted 3 months, it shows the percentage that go on to last any given number of months into the future. Here’s a copy of that graph.
Now, there are a couple of caveats here.
- I don’t have the raw data points, I can only estimate them from the graph.
- Facebook hasn’t been around that long. Forecasting future stability from this data is inherently risky. It seems likely that relationships could reach a point where they’re more stable than this trend predicts - say around the 10 year mark.
- Facebook didn’t provide standard deviations for any of its data points, making it likely that this data (even with the previous caveats) is a poor predictor for individual relationships.
That said, I’ve estimated the data from the graph and plotted it. There’s a remarkable trend.
Note just how well the data fits the curve (R² = 0.9993). That suggests that within the first few years of a relationship, stability (among Facebook users) is remarkably predictable.
Interestingly, I suspected that the 3-6 month range that Facebook provides in its original graph would be a clear outlier because of instability in the early stages of relationships. (Facebook leaves out the first 3 months for this very reason.) The trend line provided above doesn’t use the first three months. Again, the fit is remarkably good.
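If you want to repeat the fit with your own estimates read off the graph, here is a minimal sketch. The post doesn’t say what kind of curve the trend line is, so the power-law form below is an assumption, and `fit_power_law` is a hypothetical helper name. Note that this R² is computed on the log-transformed values, which may differ from how the figure above was calculated.

```python
import math

def fit_power_law(months, percents):
    """Fit percent = a * months**(-b) by least squares on a log-log scale.

    Returns (a, b, r_squared). The power-law form is an assumption,
    not something the original analysis states.
    """
    xs = [math.log(m) for m in months]
    ys = [math.log(p) for p in percents]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Ordinary least-squares slope and intercept for y = intercept + slope * x.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # R^2 in log space: 1 - (residual sum of squares / total sum of squares).
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r_squared = 1 - ss_res / ss_tot

    return math.exp(intercept), -slope, r_squared
```

Pass in two equal-length lists, months and the percentages you estimated for them, and compare the returned R² against a few other curve shapes before trusting any one of them.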
We can use this data to predict the odds that you’ve found “true love” (i.e. that you’ll never break up with your current partner).
Now, let’s figure out what this means.
If you’ve been dating more than 3 months, you can find yourself on this chart. Find the number of months you’ve been dating on the chart (if you don’t know, ask your partner). You could be anywhere along the vertical line that denotes the length of time you’ve been dating. If you’re in the “bad zone”, that means you’ll eventually break up. If you’re in the “good zone”, you won’t (at least within the 20 year window of this projection). Obviously, there’s no way to tell at this point in time which side you’re on.
But you can make a prediction (given all the caveats I’ve already mentioned). Let’s say you’re at the 18th month in your relationship. Among the group of Facebook users at this point, some are in the good (green) group, others in the bad (red) group. We can calculate what share of them are in the green group. At 18 months, the average relationship has about a 48% chance of long-term survival.
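The calculation is a conditional one: of the couples still together at a given month, what share are still together at the end of the 20 year window? Here is a rough sketch of that idea. It assumes a power-law survival curve, which the post does not confirm, and it back-solves the exponent from the 48% figure at 18 months rather than from the real trend line. The names `B` and `chance_of_lasting` are made up for illustration.

```python
import math

HORIZON_MONTHS = 240  # the 20 year window used in this post

# Assumption: survival follows a power law, S(m) proportional to m**(-B).
# B is calibrated so that month 18 gives the 48% quoted in the post.
B = math.log(0.48) / math.log(18 / HORIZON_MONTHS)

def chance_of_lasting(months_dating, horizon=HORIZON_MONTHS):
    """Share of couples together at months_dating who are still together at horizon."""
    # S(horizon) / S(months_dating) simplifies to (months_dating / horizon) ** B.
    return (months_dating / horizon) ** B
```

By construction `chance_of_lasting(18)` comes out at about 0.48, and the value climbs toward 1 as the months approach the horizon.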
Suppose you want to wait until you’re 50% sure that your relationship is long-term viable before you decide to get married (still only 50%!); you’ll have to wait until you’ve been dating 21 months for that. If you want to be 66% sure, you’ll have to wait until month 56! (The scythe of time eats relationships more slowly over time in this model.) You can calculate your own percentage with the following formula:
Just replace MONTHS with the number of months you’ve been dating, and plug it into Wolfram Alpha.
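You can also run the question the other way: how many months until you reach a given level of confidence? The sketch below is an approximation under stated assumptions, not the formula from this post. It assumes a power-law survival curve with an exponent back-solved from the 48% at 18 months figure, and `months_until` is a hypothetical helper.

```python
import math

HORIZON_MONTHS = 240  # the 20 year window used in this post

# Assumption: power-law survival, calibrated so month 18 gives 48%.
B = math.log(0.48) / math.log(18 / HORIZON_MONTHS)

def months_until(confidence, horizon=HORIZON_MONTHS):
    """First whole month at which the chance of lasting to horizon reaches confidence."""
    # Solve (m / horizon) ** B = confidence for m, then round up to a whole month.
    return math.ceil(horizon * confidence ** (1 / B))
```

Under these assumptions `months_until(0.5)` and `months_until(0.66)` land on the 21 and 56 months quoted above, which suggests the assumed curve is at least close to the real trend line.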
Conclusions: Relationships are hard.
But really, if you make relationship decisions based on this data, you’re probably crazy. It’s important to remember that the calculations above assume the model is accurate for 20 years. In reality, I suspect that this model over-predicts breakups and most relationships converge towards long-term stability after enough years. So the “survival” percentages may be slightly higher than this model predicts.
Additionally, the probabilistic nature of this analysis is important to notice. Your relationship isn’t preassigned into a green or red group. Nothing in this study undercuts the importance of good communication and making sure you and your partner are on the same page. And love - that’s important too. So get off the Internet, and go spend time with someone special.
Postscript: aside from the caveats already mentioned, this post makes two simplifications.
First, it’s possible Facebook’s chart is itself a mathematical model, not the data itself, which would make the excellent fit of the trendline uninformative.
Second, the 20 year (240 month) estimate is trying to find a middle ground between two opposing problems with extrapolating the data. As already mentioned, it’s likely that relationships stabilize more than the data predicts. On the other hand, people certainly do break up or get divorced after 20+ year relationships. The trend line converges to 0 at infinity, which isn’t accurate (we don’t live forever, for one thing). So given the lack of better data, we have to choose a year to balance between both issues. I’ve chosen the fairly ad hoc 20 year point, but perhaps a better one could be found.
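To get a feel for how much the choice of horizon matters, here is a small sketch that recomputes the 18 month estimate for a few different windows. As before, the power-law curve is an assumption and the exponent is back-solved from the 48% at 18 months figure; `chance_by_horizon` is a hypothetical name.

```python
import math

# Assumption: power-law survival, calibrated so that month 18 gives 48%
# when the horizon is the post's 240 months.
B = math.log(0.48) / math.log(18 / 240)

def chance_by_horizon(months_dating, horizons_in_years=(10, 20, 30)):
    """Map each horizon (in years) to the chance of lasting that long."""
    return {
        years: (months_dating / (years * 12)) ** B
        for years in horizons_in_years
    }
```

Calling `chance_by_horizon(18)` shows the estimate rising for a shorter window and falling for a longer one, so the headline percentages depend heavily on where you draw the line.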