WIRED.com is now entirely HTTPS. In other words: All our content is encrypted in transit from our servers to your browser, and this ensures no one is fiddling with that content before it reaches you.
We began this rollout nearly five months ago and took the final step of turning on HTTPS across the entire site last week. Now that we’ve reached this milestone, we wanted to share our experiences with you—just in case you want to move a media site to HTTPS.
Planning and preparing for moving to HTTPS was more of a human engineering than a technical engineering project. We started discussing the possibility of moving to HTTPS as far back as June 2015, with conversations intensifying toward the end of 2015. We coordinated closely with our ad teams—both the people who manage the technical and the operational aspects of ad delivery. We also worked with our SEO and business development teams as we evaluated risks associated with moving to HTTPS.
We decided that a staged rollout, where we convert one section at a time to HTTPS, would allow us to take on smaller amounts of risk at a time (a move that we cribbed from The Washington Post). With each section migration, we could evaluate the impact of the change.
Before encrypting the whole site, we converted three individual sections: Security, Transportation, and Design. Together, they covered approximately 10 percent of our content. We did these three migrations at three different times over the past four months. Additionally, we launched our brand new Video section during this process and deployed it with HTTPS from the start.
While we had a good plan for our rollout, an HTTPS migration of this nature is complex, and we ran into some issues along the way. If you’re considering a move to HTTPS, we have five basic recommendations:
- Deploy HTTPS on small sections before rolling it out site-wide to assess risk as you go, though this can complicate SEO
- Monitor mixed content issues via CSP reporting
- Use upgrade-insecure-requests (now in Chrome, Firefox, and Safari) to fix mixed content issues automatically for many users
- Use 301 redirects, update canonical URLs, and update your sitemaps to use HTTPS URLs for best SEO results
- Work with the teams that manage third-party content to ensure that they are prepared to update their processes for managing HTTPS-compliant content
If you want to learn more, read on.
On April 28, we began the rollout by migrating our Security section to HTTPS. And we learned a lot in just the first few hours.
Some content was inaccessible due to redirect loops. Previously, we redirected HTTPS requests to HTTP, and these redirects were cached. When we then turned on HTTP-to-HTTPS redirects, we created a redirect loop. To fix this, we flushed the cached redirects for the affected URLs at our CDN. Learning from this, we prepared a “prelaunch” mode for future rollouts. Prelaunch mode allows pages to be accessed at either their HTTP or HTTPS URLs, without redirects being cached. Once we were sure no HTTPS-to-HTTP redirects were cached, we could turn on HTTP-to-HTTPS redirects without the dreaded redirect loop.
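To make the failure concrete, here is a minimal Python sketch of how a stale cached redirect and a new origin redirect can chase each other. It is not our production configuration: the `Mode` names, the `redirect_target` and `has_redirect_loop` helpers, and the `example.com` URLs are all hypothetical, and it assumes prelaunch mode issues no redirects in either direction.

```python
from enum import Enum
from typing import Dict, Optional
from urllib.parse import urlsplit, urlunsplit


class Mode(Enum):
    """Hypothetical rollout modes for one section of a site."""
    LEGACY = "legacy"        # HTTPS requests are redirected to HTTP
    PRELAUNCH = "prelaunch"  # both schemes are served; assumed: no redirects
    HTTPS = "https"          # HTTP requests are redirected to HTTPS


def redirect_target(mode: Mode, url: str) -> Optional[str]:
    """Return the URL the origin would redirect to, or None if it serves the page."""
    parts = urlsplit(url)
    if mode is Mode.LEGACY and parts.scheme == "https":
        return urlunsplit(parts._replace(scheme="http"))
    if mode is Mode.HTTPS and parts.scheme == "http":
        return urlunsplit(parts._replace(scheme="https"))
    return None


def has_redirect_loop(start: str, cached: Dict[str, str], mode: Mode,
                      max_hops: int = 10) -> bool:
    """Follow redirects, preferring cached ones; True if any URL repeats."""
    seen = set()
    url = start
    for _ in range(max_hops):
        if url in seen:
            return True
        seen.add(url)
        # A cached redirect (for example at a CDN) wins over the origin's answer.
        nxt = cached.get(url) or redirect_target(mode, url)
        if nxt is None:
            return False  # the page is served; the chain ends here
        url = nxt
    return True  # too many hops; treat as a loop


# A stale HTTPS-to-HTTP redirect left over from the legacy setup.
stale = {"https://example.com/security/": "http://example.com/security/"}
print(has_redirect_loop("http://example.com/security/", stale, Mode.HTTPS))  # loop
print(has_redirect_loop("http://example.com/security/", {}, Mode.HTTPS))     # no loop
```

In this model, running in `Mode.PRELAUNCH` until the cache holds no stale entries means that switching to `Mode.HTTPS` never meets a cached redirect pointing the other way.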
We also immediately identified content that was still being served over HTTP. We used Content Security Policy’s reporting mechanism to find HTTP content still served on the site. These pieces of HTTP content were being used on HTTPS pages, causing mixed content errors. Most of these were related to configuration changes between our staging and production environments that we simply missed. We were monitoring for these issues and were able to quickly spot and fix them.
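For readers who have not used CSP reporting, the sketch below shows the general idea in Python. It is an illustration, not our collector: the policy string, the `/csp-reports` endpoint, and the `insecure_host` and `count_insecure_hosts` helpers are assumptions. It relies only on the standard violation report shape, a JSON body with a `csp-report` object containing a `blocked-uri` field.

```python
import json
from collections import Counter
from typing import Iterable, Optional
from urllib.parse import urlsplit

# Assumed report-only policy: nothing is blocked, but anything not loaded
# over HTTPS is reported to a hypothetical /csp-reports endpoint.
REPORT_ONLY_HEADER = (
    "Content-Security-Policy-Report-Only",
    "default-src https: data: 'unsafe-inline' 'unsafe-eval'; "
    "report-uri /csp-reports",
)


def insecure_host(report_body: bytes) -> Optional[str]:
    """Return the host of a blocked plain-HTTP resource, or None otherwise."""
    report = json.loads(report_body).get("csp-report", {})
    blocked = urlsplit(report.get("blocked-uri", ""))
    if blocked.scheme == "http":
        return blocked.hostname
    return None  # inline, eval, or other non-HTTP violations are ignored here


def count_insecure_hosts(report_bodies: Iterable[bytes]) -> Counter:
    """Tally which hosts are still serving assets over HTTP."""
    counts: Counter = Counter()
    for body in report_bodies:
        host = insecure_host(body)
        if host:
            counts[host] += 1
    return counts


# Hypothetical report, shaped like the ones browsers POST to the endpoint.
example = json.dumps({
    "csp-report": {
        "document-uri": "https://example.com/security/",
        "blocked-uri": "http://ads.example.net/pixel.gif",
        "violated-directive": "img-src",
    }
}).encode()
print(count_insecure_hosts([example]).most_common(5))
```

Grouping by host rather than by full URL keeps the tally short enough to scan quickly for third parties that need attention.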
When we started our rollout, we targeted May 12 as the day that we would turn on HTTPS across the site. Yep, we thought everything would be ready for prime time after merely two weeks. As I sit here writing this four months after our planned launch date, I cannot help but think how naive we were.
The primary issue was SEO. Our Security section appeared to take a rather catastrophic nose dive with regard to search referrals, search clicks and page rank. While changes in these metrics can certainly be attributed to other situational factors, it was pretty clear to us that this was related to the HTTPS switchover. We reviewed the situation to assess what we had done incorrectly.
The first change we made was to update our redirects from 302 to 301 redirects. In HTTP nomenclature, a “302 redirect” means that the redirect is only “temporary,” whereas a “301 redirect” means the redirect is “permanent.” We had intentionally used 302 redirects because we thought this would allow us to easily roll back if necessary. It is possible that we confused search engines by failing to signal that the content had permanently moved.
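As an illustration of the difference, here is a small Python WSGI sketch that answers every request with a permanent redirect to the HTTPS version of the same URL. It is a generic example, not our server configuration; the function name `redirect_to_https` and the `example.com` fallback host are hypothetical.

```python
from http import HTTPStatus


def redirect_to_https(environ, start_response):
    """WSGI app: redirect a request to the HTTPS version of the same URL."""
    host = environ.get("HTTP_HOST", "example.com")  # fallback host is hypothetical
    path = environ.get("PATH_INFO", "/")
    query = environ.get("QUERY_STRING", "")
    target = f"https://{host}{path}" + (f"?{query}" if query else "")

    # 301 signals a permanent move. HTTPStatus.FOUND (302) would signal a
    # temporary one, which is what we originally used.
    status = HTTPStatus.MOVED_PERMANENTLY
    start_response(
        f"{status.value} {status.phrase}",
        [("Location", target), ("Content-Length", "0")],
    )
    return [b""]
```

In a real deployment this would only be mounted for plain-HTTP traffic, and the redirect is often handled at the CDN or load balancer instead of in application code.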
We did correctly update the canonical URLs for each page. If the page was served over HTTPS, we updated its canonical URL to HTTPS. Given that canonical URLs have so much influence, we assumed that the redirects (even 302 redirects), plus proper canonical URLs would be sufficient to signal the real location of content.
It is incredibly difficult to understand what works and what does not work with regard to SEO. That said, I think we should have launched with 301 (permanent) redirects. However, I cannot say whether it would have made a difference. The feedback loop for SEO-related changes is so slow and influenced by so many factors that it is difficult to pin changes to any single cause.
When we evaluated our HTTPS rollout on our Security section, we also wanted to understand how well our ads performed in this two week phase. Our main concern with ads was discrepancies between first and third party reporting. First party reporting is the ad deliverability tracking conducted by our internal teams. Third party reporting is tracking that is implemented by other, non-WIRED, organizations. When our first party and third party tracking doesn’t match, it raises concerns about the state of the ad delivery. Fortunately, our analysis of this data suggested that we were seeing no concerning discrepancies after launching HTTPS.
Additionally, we looked at the number of mixed content issues that our visitors experienced. From this data, we were able to tell that our site triggered numerous mixed content issues. By examining this data, we were able to pinpoint categories of content that were still being delivered over HTTP. As you can probably guess, much of this content was from third parties. We were able to identify some systemic issues that we could address with our ad teams. Specifically, we found that non-visible HTTP assets were commonly overlooked in QA processes. Since these assets were not clearly breaking the visual display of the ads, they would pass QA. We worked with our ad teams to look for these assets by using the Chrome Developer Tools Security Panel, which highlights insecure domains. Our ad teams later introduced automated tools to help spot mixed content issues.
Beyond examining individual mixed content issues, we wanted to assess if, as a company, we were getting better at delivering content over HTTPS. To this end, we calculated a daily ratio of mixed content errors to page views, which gives us a standardized value for judging progress that is not impacted by changes in page views. We saw a steady decline in this ratio in the first two weeks of HTTPS delivery, suggesting that our engineering and ad teams were effectively removing mixed content from the site over time.
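The calculation itself is simple division. The Python sketch below shows it with made-up numbers; the function names and the daily counts are hypothetical, not our real data.

```python
from typing import Dict, Tuple


def mixed_content_ratio(errors: int, page_views: int) -> float:
    """Mixed content errors per page view for one day."""
    if page_views <= 0:
        raise ValueError("page_views must be positive")
    return errors / page_views


def daily_ratios(counts: Dict[str, Tuple[int, int]]) -> Dict[str, float]:
    """Map each day to its ratio, given (errors, page_views) per day."""
    return {
        day: mixed_content_ratio(errors, views)
        for day, (errors, views) in sorted(counts.items())
    }


# Hypothetical counts: traffic rises on day 2 while errors fall.
counts = {"day-1": (50, 1000), "day-2": (30, 1500)}
for day, ratio in daily_ratios(counts).items():
    print(f"{day}: {ratio:.3f} errors per page view")
```

Because the ratio divides by page views, a traffic spike alone does not make the trend look worse, and a quiet day alone does not make it look better.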
Then we looked for general trends among mixed content errors. We noticed that 77% of the mixed content errors stemmed from WebKit, the browser engine used by Safari. This was due to WebKit not supporting the “upgrade-insecure-requests” directive for Content Security Policy. This directive instructs browsers to automatically use HTTPS to download assets, even if their URLs use the HTTP scheme. It fixes a broad array of mixed content errors without manual intervention.
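The directive is delivered as part of the `Content-Security-Policy` response header. As a rough Python sketch, a helper like the one below could add it to a response's headers, merging it into an existing policy if there is one. The helper name `with_upgrade_directive` and the header-list representation are assumptions for illustration.

```python
from typing import List, Tuple

Header = Tuple[str, str]
DIRECTIVE = "upgrade-insecure-requests"


def with_upgrade_directive(headers: List[Header]) -> List[Header]:
    """Return a copy of headers whose CSP includes upgrade-insecure-requests."""
    result: List[Header] = []
    found_csp = False
    for name, value in headers:
        if name.lower() == "content-security-policy":
            found_csp = True
            directives = [d.strip() for d in value.split(";") if d.strip()]
            if DIRECTIVE not in directives:
                directives.append(DIRECTIVE)
            value = "; ".join(directives)
        result.append((name, value))
    if not found_csp:
        # No policy yet: send the directive on its own.
        result.append(("Content-Security-Policy", DIRECTIVE))
    return result


print(with_upgrade_directive([("Content-Type", "text/html")]))
```

Browsers that do not support the directive ignore it and keep requesting the HTTP URLs, so it reduces mixed content errors rather than replacing the work of fixing the underlying URLs.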
After we published a story discussing how helpful Safari support for upgrade-insecure-requests would be, Apple began implementing the feature in Safari. Much to our delight, the Safari team has fully implemented “upgrade-insecure-requests” in WebKit, which is available in iOS 10 this week and will arrive in macOS Sierra later this fall.
At this point, we thought that ad delivery was handled sufficiently. We also thought that we fixed our most egregious SEO mistakes and decided to move another section, Transportation, to HTTPS as a test. We certainly were not comfortable enough to move the whole site to HTTPS yet. We stated that we would aim to convert the rest of the site to HTTPS on May 24th.
After 12 days of the Transportation section on HTTPS and nearly a month of the Security section on HTTPS, SEO performance continued to be a concern. The Security section still was nowhere near recovering, and Transportation’s SEO performance was down. It was not nearly as bad as Security, but still bad enough that we did not have confidence in moving the rest of the site to HTTPS.
When you read about best practices for performing “site moves”—the term Google’s SEO guides use for changing a site to HTTPS—you will find estimates for SEO recovery to be from a few weeks to 2–3 months. Perhaps we just needed to wait longer to see the recovery? Maybe. They do not suggest SEO issues as bad as we experienced, so we still believed something else was amiss.
Examining our site once again, we reviewed our sitemaps, which we had not touched at all when moving the Security and Transportation sections to HTTPS. Best practices for handling sitemaps for a “partial site move” (i.e., for moving only part of a site to HTTPS) are not entirely clear. We were not able to find any literature discussing this specifically. As such, we thought that as long as the redirects and canonicals were in place, the search engines would figure it out. Because we were a month into this and still seeing SEO issues, we decided to follow best practices for a full site move with regard to sitemaps.
We decided to make a few changes to our sitemaps. To begin, the URLs in our primary sitemaps were updated to use the proper scheme where necessary: If a story had been converted to HTTPS, it now had the HTTPS URL in the sitemap. This change was pushed out on June 6. Next, we created a second sitemap that only included stories served over HTTPS. This sitemap was created and deployed on June 14.
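To illustrate the first change, here is a minimal Python sketch that writes each story's sitemap URL with the scheme its section is served on. It is not our sitemap generator: the `build_sitemap` helper, the `example.com` host, the story paths, and the idea of deciding the scheme from the first path segment are all assumptions.

```python
import xml.etree.ElementTree as ET
from typing import Iterable, Set

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(host: str, story_paths: Iterable[str],
                  https_sections: Set[str]) -> str:
    """Build sitemap XML, using HTTPS only for sections already migrated."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for path in story_paths:
        # Assumption: the first path segment names the section.
        section = path.strip("/").split("/")[0]
        scheme = "https" if section in https_sections else "http"
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = f"{scheme}://{host}{path}"
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical stories: only Security and Transportation have migrated.
print(build_sitemap(
    "example.com",
    ["/security/story-a/", "/business/story-b/"],
    {"security", "transportation"},
))
```

The second, HTTPS-only sitemap could be produced from the same function by passing in only the stories from migrated sections.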
During this period of time, we went silent. We wanted to update people following this story, but we didn’t have much to say. “We haz SEOz problems and we are trying to figure it out” was about all we could communicate—not much of a story.
At this time, we also began to engage others to further understand if we were doing anything wrong. We reached out to Google and asked them if there was a reason that we were seeing such terrible results. The initial response from Google was that we were not doing anything problematic with our SEO implementation and we should not expect to see any changes in search traffic after 3 to 4 weeks. We also studied the HTTPS rollouts of other media organizations and found no evidence of SEO problems on a scale similar to what we experienced.
After this set of changes, we waited, hoping that we’d see the rebound we were promised. We also checked in on our mixed content issues during this time. We continued to see acceptable amounts of issues and a downward trend in the ratio of mixed content issues to page views. Our ad teams were getting into a rhythm with spotting and removing HTTP requests from ads before they ever made it to our site. At this point, we were quite confident in our ad delivery over HTTPS.
Ironically, ensuring appropriate ad delivery was the primary reason that we took a staged approach to the rollout. However, it seemed that this approach was what caused our SEO woes, and ad issues were not as problematic as we had anticipated.
Throughout the summer, we continued to monitor SEO impact. We hoped that by the 2 to 3 month mark we would be recovered. And in mid-August, we heard back from Google regarding our SEO issues. They indicated that their analysis showed that there was no significant loss caused by the HTTPS transition. They confirmed that there was an unspecified issue right after the initial Security transition and that we recovered from it quickly.
They emphasized that, because we are a news site, our ranking tends to change frequently due to the regular cycle of news. If we are not publishing anything that is making major news, we could see ranking fluctuations. As such, we cannot parse that out from the HTTPS transition. We were told that they saw no evidence that our HTTPS transition was causing declines in search traffic or clicks.
At this point, we decided that we would do one final test on our Design section. With all the potential mistakes corrected and assurance that our SEO would not be impacted, we moved our Design section to HTTPS and planned to wait 2 weeks. If SEO looked good, we would move to HTTPS everywhere.
The Whole Enchilada
On September 8, we rolled out HTTPS site-wide, after our search traffic analysis of the Design section suggested that the transition did not impact traffic. Our ratio of mixed content errors to page views continued to show a downward trend. Our results suggested that whatever problems we were previously suffering from had been properly mitigated, so we were comfortable moving to HTTPS everywhere. Similar to when we enabled HTTPS on the Security section, we found a few mixed content issues that we had missed. These were quickly mitigated, because we were closely watching our mixed content errors.
Other than those minor issues, the rollout went smoothly. We avoided the redirect issues that we previously experienced and were prepared to handle sitemaps more efficiently than we previously had.
Moving such a large site with so much content is a massive team effort. While it is easy to think of this project as purely technical, the real challenge was preparing our whole organization for what it means to deploy a website and its web of dependencies over HTTPS. It required a lot of people to be forced to change their processes, which is no small thing to ask. No one ever pushed back on this process without a good reason. There was an overwhelming willingness within WIRED to get this done.
Peter Elbaor, Pawan Sandhu, Christina Doehmer, and Robbie Sauerberg ensured that ad content would be delivered properly over HTTPS. They provided data to evaluate our success, reworked their quality assurance process to handle HTTPS, and worked with third parties to educate them about HTTPS. Our SEO team, composed of John Shehata and Ron Tumbokon, guided us through the SEO-related setbacks we experienced. They leveraged their years of experience to help us assess our issues and strategize fixes and tests. Toy Charassinvichai on the data team helped us build out a Content Security Policy violation report collector. This collector was arguably one of the most important technical contributions to the project.
Mark McClusky and Sam Baldwin prioritized the project from a product perspective and never let it fall off the roadmap regardless of the setbacks we faced. Scott Dadich, our Editor-in-Chief, never needed convincing that this was important and backed it from the earliest stages of the project. Finally, Kathleen Vignos, our former Director of Engineering, allowed WIRED’s Tech Team to sink time into researching this project while pursuing an otherwise aggressive technical roadmap.
Large scale HTTPS transitions are massive undertakings. My biggest piece of advice when taking on this challenge as part of a media organization is to build your team before you build your tech. HTTPS deployments touch every single piece of your tech stack and business and you need people to advocate and problem solve every step of the way.
This article was syndicated from wired.com