{"id":11775,"date":"2020-04-14T09:35:53","date_gmt":"2020-04-14T09:35:53","guid":{"rendered":"https:\/\/slackhq.com\/engineering\/?p=11775"},"modified":"2023-11-21T22:31:45","modified_gmt":"2023-11-21T22:31:45","slug":"hacklang-at-slack-a-better-php","status":"publish","type":"post","link":"https:\/\/slack.engineering\/hacklang-at-slack-a-better-php\/","title":{"rendered":"Hacklang at Slack: A Better PHP"},"content":{"rendered":"<p>Slack launched in 2014 with a PHP 5 backend. Along with\u00a0<a href=\"https:\/\/blog.box.com\/going-forward-faster-hhvm\" target=\"_blank\" rel=\"noopener noreferrer\">several<\/a>\u00a0<a href=\"https:\/\/blog.wikimedia.org\/2014\/12\/29\/how-we-made-editing-wikipedia-twice-as-fast\/\" target=\"_blank\" rel=\"noopener noreferrer\">other<\/a>\u00a0<a href=\"https:\/\/codeascraft.com\/2015\/04\/06\/experimenting-with-hhvm-at-etsy\/\" target=\"_blank\" rel=\"noopener noreferrer\">companies<\/a>, we switched to\u00a0<a href=\"https:\/\/hhvm.com\/\" target=\"_blank\" rel=\"noopener noreferrer\">HHVM<\/a>\u00a0in 2016 because it ran our PHP code faster. We stayed with HHVM because it offers an entirely new language:\u00a0<a href=\"http:\/\/hacklang.org\/\" target=\"_blank\" rel=\"noopener noreferrer\">Hack<\/a>\u00a0(searchable as Hacklang).<\/p>\n<p>Hack makes our developers faster by improving productivity through better tooling. Hack began as a superset of PHP, retaining its\u00a0<a href=\"\/taking-php-seriously\" target=\"_blank\" rel=\"noopener noreferrer\">best parts<\/a>\u00a0like the edit-refresh workflow and request-oriented memory model that enable speedy development. 
In addition to a number of quality-of-life improvements, Hack adds a better type system and a static type checker, which help catch bugs and allow developers to code and refactor with more confidence.<\/p>\n<p>In this post we\u2019ll talk about how and why we migrated to Hack, the benefits it gave us, and things to consider for your own codebase.<\/p>\n<h2>Static type checking is a game changer<\/h2>\n<p>PHP\u2019s type system has come a long way\u00a0<a href=\"https:\/\/www.php.net\/manual\/en\/migration70.new-features.php\" target=\"_blank\" rel=\"noopener noreferrer\">since PHP 5<\/a>, when it was not possible to annotate return types,\u00a0<a href=\"https:\/\/wiki.php.net\/rfc\/typed_properties_v2\" target=\"_blank\" rel=\"noopener noreferrer\">class properties<\/a>, or scalar types. Its remaining holes, like the lack of\u00a0<a href=\"https:\/\/wiki.php.net\/rfc\/generics\" target=\"_blank\" rel=\"noopener noreferrer\">generics<\/a>, may be resolved in the future. But its biggest flaw is that types are only checked at runtime. This is the most costly time to find out about type-related bugs, either by breaking a test suite or worse \u2014 a user report or production error log.<\/p>\n<p>With Hack, type checking happens\u00a0<strong>statically<\/strong>\u00a0(without running the code) and\u00a0<a href=\"https:\/\/marketplace.visualstudio.com\/items?itemName=pranayagarwal.vscode-hack\" target=\"_blank\" rel=\"noopener noreferrer\"><strong>as you type<\/strong><\/a><em>.\u00a0<\/em>Change the signature of a function with hundreds of call sites, and you\u2019ll see errors for the ones that need updating before even hitting save.<\/p>\n<p>This is a\u00a0<strong><em>game changer<\/em><\/strong><em>\u00a0<\/em>for productivity \u2014 the difference between finding a bug milliseconds after typing compared to waiting for a comprehensive test suite (or finding out after deploying) is hard to overstate. 
It\u2019s akin to the productivity difference between developing websites in PHP vs. C. With Hack, you don\u2019t bother trying to run the code until the type checker is passing, and by then it usually\u00a0<em>just works<\/em>. This allows Slack developers to build and refactor with confidence, focusing testing efforts on higher-value areas like\u00a0<strong>logic bugs<\/strong>, which static typing can\u2019t help prevent.<\/p>\n<p>Static type checking is possible in PHP with\u00a0<a href=\"https:\/\/psalm.dev\/\" target=\"_blank\" rel=\"noopener noreferrer\">community<\/a>\u00a0<a href=\"https:\/\/github.com\/phpstan\/phpstan\" target=\"_blank\" rel=\"noopener noreferrer\">packages<\/a>, and if you\u2019re using PHP I\u2019d strongly recommend using one of these. However, Hack\u2019s type checker has the advantage of a much more full-featured type system to work with. Hack is built from the ground up to enable static type checking, with features PHP lacks like\u00a0<a href=\"https:\/\/docs.hhvm.com\/hack\/generics\/some-basics\" target=\"_blank\" rel=\"noopener noreferrer\">generics<\/a>,\u00a0<a href=\"https:\/\/docs.hhvm.com\/hack\/built-in-types\/shapes\" target=\"_blank\" rel=\"noopener noreferrer\">shapes<\/a>,\u00a0<a href=\"https:\/\/docs.hhvm.com\/hack\/built-in-types\/enumerated-types\" target=\"_blank\" rel=\"noopener noreferrer\">enums<\/a>,\u00a0<a href=\"https:\/\/docs.hhvm.com\/hack\/built-in-types\/arrays\" target=\"_blank\" rel=\"noopener noreferrer\">Hack arrays<\/a>, and a well-typed\u00a0<a href=\"https:\/\/docs.hhvm.com\/hsl\/reference\/\" target=\"_blank\" rel=\"noopener noreferrer\">standard library<\/a>\u00a0to enable rigorous static analysis.<\/p>\n<h2>Gradual typing enables a migration<\/h2>\n<p>We started with Hack in\u00a0<a href=\"https:\/\/docs.hhvm.com\/hack\/source-code-fundamentals\/program-structure#legacy-files\" target=\"_blank\" rel=\"noopener noreferrer\">partial mode<\/a>, which treats all untyped values as the 
\u201cany\u201d type, usable for any purpose. TypeScript takes the\u00a0<a href=\"https:\/\/www.typescriptlang.org\/docs\/handbook\/migrating-from-javascript.html#moving-to-typescript-files\" target=\"_blank\" rel=\"noopener noreferrer\">same approach<\/a>. This enabled an incremental migration\u2014adding types over time. As files became fully typed, we changed them to the default\u00a0<em>strict<\/em>\u00a0mode so that they stayed that way.<\/p>\n<p>Surprisingly, gradually adding types to a weakly typed codebase made me\u00a0<strong>more thoughtful about type safety<\/strong>\u00a0than I ever was working in strongly typed languages like Java or Go. Instead of a requirement to get the compiler to run, types were a conscious decision to add value to the codebase. We had to justify spending time adding types by observing how they changed our working lives. Some parts of the codebase were easy to type, but others required refactoring to enable type safety.<\/p>\n<p>Not only have we found and prevented bugs, but types serve as a form of in-line documentation that is\u00a0<strong>verifiable<\/strong>\u00a0(unlike comment blocks), helping everyone read and understand the code. They also serve as a\u00a0<strong>contract<\/strong>\u00a0between different parts of the codebase. This has been crucial to productivity in a large, shared codebase like Slack\u2019s backend.<\/p>\n<p>Hack\u2019s type system has one feature in particular, Shapes, that caught on like wildfire at Slack, and I believe it\u2019s the reason we never looked back once we introduced Hack to our codebase.<\/p>\n<h2>Shapes help represent complex structures<\/h2>\n<p>PHP\u2019s\u00a0<code>array<\/code>\u00a0type, bewilderingly, can act as both a list (an ordered sequence of values) and a map (a set of key-value pairs) at the same time. Most programming languages use separate types for these. 
In my experience, this is an endless source of bugs in PHP code, especially as functions like\u00a0<a href=\"https:\/\/www.php.net\/manual\/en\/function.array-merge.php\" target=\"_blank\" rel=\"noopener noreferrer\"><code>array_merge<\/code><\/a>\u00a0treat list-like and map-like arrays differently.<\/p>\n<p>Hack improves upon this by separating these into\u00a0<a href=\"https:\/\/docs.hhvm.com\/hack\/built-in-types\/arrays\" target=\"_blank\" rel=\"noopener noreferrer\">different types\u00a0<\/a>and using\u00a0<a href=\"https:\/\/docs.hhvm.com\/hack\/generics\/introduction\" target=\"_blank\" rel=\"noopener noreferrer\">generics<\/a>\u00a0to describe the types of their keys and values. A list-like array containing strings is a\u00a0<code>vec&lt;string&gt;<\/code>, and a map-like array with string keys and integer values is a\u00a0<code>dict&lt;string, int&gt;<\/code>.<\/p>\n<h2>But what about dicts that contain multiple types?<\/h2>\n<p><code>dict&lt;string, mixed&gt;<\/code>\u00a0is a valid, but not particularly useful type annotation, which says the dict contains\u00a0<code>string<\/code>\u00a0keys and values of\u00a0<em>any<\/em>\u00a0type.<\/p>\n<p>Enter\u00a0<a href=\"https:\/\/docs.hhvm.com\/hack\/built-in-types\/shapes\" target=\"_blank\" rel=\"noopener noreferrer\">shapes<\/a>. A\u00a0<code>shape<\/code>\u00a0is an array that contains known keys with specific types. Keys may be optional if preceded by\u00a0<code>?<\/code>. 
This example shape definition represents the arguments of an HTTP POST request, which has many optional fields:<\/p>\n<pre><code class=\"language-php\">type http_post_options = shape(\n  ?&#039;timeout&#039; =&gt; int,\n  ?&#039;port&#039; =&gt; int,\n  ?&#039;http_basic_auth&#039; =&gt; string,\n  ?&#039;headers&#039; =&gt; dict&lt;string, string&gt;,\n  ?&#039;form_data&#039; =&gt; dict&lt;string, string&gt;,\n  ?&#039;json_payload&#039; =&gt; JsonSerializable,\n  ?&#039;user_agent&#039; =&gt; string,\n  ?&#039;follow_redirects&#039; =&gt; bool,\n);<\/code><\/pre>\n<p>This function signature uses that shape to type the\u00a0<code>$options<\/code>\u00a0parameter:<\/p>\n<pre><code class=\"language-php\">function http_post(\n  string $url,\n  http_post_options $options\n): http_response {\n  \/\/ ... implementation here\n}<\/code><\/pre>\n<p>A call site might look like this:<\/p>\n<pre><code class=\"language-php\">$result = http_post(&#039;https:\/\/example.com&#039;, shape(\n  &#039;timeout&#039; =&gt; 10,\n  &#039;form_data&#039; =&gt; dict[&#039;example&#039; =&gt; &#039;test&#039;],\n));<\/code><\/pre>\n<p>Not only does this help ensure the correct types are used for each field, it also helps prevent typos in key names, both at the call site and in the function body where the shape is accessed. This makes the shape much more impactful to developer productivity than a simple\u00a0<code>array<\/code>\u00a0type annotation. Before shapes, assembling a call to such a function would require reading its body or a large doc block (which may not be fully up to date) to understand the names, expected types, and \u201coptional vs. 
required\u201d status for each argument.<\/p>\n<p>Shapes are used for a variety of use cases at Slack, including:<\/p>\n<ul>\n<li>Database rows (shapes code-generated directly from the DB schema)<\/li>\n<li>Expected results of decoding JSON payloads<\/li>\n<li>Functions with many optional arguments, as above<\/li>\n<\/ul>\n<h2>Async\/await enables simple concurrency<\/h2>\n<p>As more features are added to Slack, each request tends to have more work to do. A common way to keep the user experience snappy is concurrency: doing multiple things at the same time in a single request.<\/p>\n<p>In many programming languages, adding concurrency means adding significant complexity with\u00a0<a href=\"https:\/\/en.wikipedia.org\/wiki\/Mutual_exclusion\" target=\"_blank\" rel=\"noopener noreferrer\">mutexes<\/a>, thread-safe data structures, or callbacks. These things slow developers down, making code more difficult to reason about and debug.<\/p>\n<p><a href=\"https:\/\/docs.hhvm.com\/hack\/asynchronous-operations\/introduction\" target=\"_blank\" rel=\"noopener noreferrer\">Hack<\/a>\u00a0is one of a\u00a0<a href=\"https:\/\/docs.microsoft.com\/en-us\/dotnet\/csharp\/programming-guide\/concepts\/async\/\" target=\"_blank\" rel=\"noopener noreferrer\">handful<\/a>\u00a0<a href=\"https:\/\/javascript.info\/async-await\" target=\"_blank\" rel=\"noopener noreferrer\">of<\/a>\u00a0<a href=\"https:\/\/blog.rust-lang.org\/2019\/11\/07\/Async-await-stable.html\" target=\"_blank\" rel=\"noopener noreferrer\">languages<\/a>\u00a0that implement the\u00a0<a href=\"https:\/\/en.wikipedia.org\/wiki\/Async\/await\" target=\"_blank\" rel=\"noopener noreferrer\">async\/await<\/a>\u00a0pattern for\u00a0<strong>multitasking without multithreading<\/strong>. Async\/await is a simple abstraction that allows functions to be\u00a0<em>paused<\/em>\u00a0while waiting for I\/O, freeing up the runtime to schedule other tasks. 
By simply adding the\u00a0<code>async<\/code>\u00a0and\u00a0<code>await<\/code>\u00a0keywords and following a few\u00a0<a href=\"https:\/\/docs.hhvm.com\/hack\/asynchronous-operations\/guidelines\" target=\"_blank\" rel=\"noopener noreferrer\">guidelines<\/a>, code can be migrated to take advantage of concurrency without breaking the mental model of how the code works.<\/p>\n<p>Here\u2019s an example using the\u00a0<code>concurrent<\/code>\u00a0code block to fetch data from two sources at once. These fetches were previously done sequentially. Adding\u00a0<code>await<\/code>\u00a0and\u00a0<code>concurrent<\/code>\u00a0keeps the code easy to read while allowing the fetches to take place concurrently.<\/p>\n<pre><code class=\"language-php\">async function get_mentions(User $user): Awaitable&lt;vec&lt;Mention&gt;&gt; {\n  concurrent {\n    \/\/ fetch @user mentions\n    $at_mentions = await get_at_mentions($user);\n    \/\/ fetch @channel mentions for channels the user is in\n    $channel_mentions = await get_at_channel_mentions($user);\n  }\n  return sort_mentions($at_mentions, $channel_mentions);\n}<\/code><\/pre>\n<h2>Breaking PHP compatibility frees Hack to grow<\/h2>\n<p>HHVM has come a long way since Slack began using it.\u00a0<a href=\"https:\/\/hhvm.com\/blog\/2018\/09\/12\/end-of-php-support-future-of-hack.html\" target=\"_blank\" rel=\"noopener noreferrer\">Breaking compatibility with PHP<\/a>\u00a0was a controversial decision that required us to eliminate every last line of PHP code and every PHP dependency from our codebase, but it has enabled huge efficiency and soundness improvements to the language. 
Since the\u00a0<a href=\"https:\/\/hhvm.com\/blog\/2019\/02\/11\/hhvm-4.0.0.html\" target=\"_blank\" rel=\"noopener noreferrer\">HHVM 4.0<\/a>\u00a0release that removed PHP support, the developers have\u00a0<a href=\"https:\/\/hhvm.com\/blog\/\" target=\"_blank\" rel=\"noopener noreferrer\">rapidly removed<\/a>\u00a0\u201cPHPisms\u201d that inhibit type safety and\/or performance, while adding\u00a0<a href=\"https:\/\/docs.hhvm.com\/hack\/generics\/reified-generics\" target=\"_blank\" rel=\"noopener noreferrer\">useful<\/a>\u00a0<a href=\"https:\/\/docs.hhvm.com\/hack\/expressions-and-operators\/type-assertions\" target=\"_blank\" rel=\"noopener noreferrer\">new<\/a>\u00a0<a href=\"https:\/\/docs.hhvm.com\/hack\/statements\/using\" target=\"_blank\" rel=\"noopener noreferrer\">features<\/a>. Keeping up with these updates in a large codebase is nearly a full time job.<\/p>\n<p>The largest downside to leaving the PHP community is the loss of an extensive ecosystem of open source packages on\u00a0<a href=\"https:\/\/packagist.org\/\" target=\"_blank\" rel=\"noopener noreferrer\">Packagist<\/a>. 
Luckily, Hack projects can still be\u00a0<a href=\"https:\/\/packagist.org\/explore\/?query=hacklang&amp;tags=hacklang~hack\" target=\"_blank\" rel=\"noopener noreferrer\">published<\/a>\u00a0on Packagist, and there are several high quality ones:<\/p>\n<ul>\n<li><a href=\"https:\/\/github.com\/hhvm\/hhast\" target=\"_blank\" rel=\"noopener noreferrer\">HHAST<\/a>\u00a0enables expressive lint rules and automated code migrations with a Syntax Tree, unlike PHP\u2019s\u00a0<a href=\"https:\/\/github.com\/squizlabs\/PHP_CodeSniffer\" target=\"_blank\" rel=\"noopener noreferrer\">packages<\/a>\u00a0which involve parsing a\u00a0<a href=\"https:\/\/www.php.net\/manual\/en\/function.token-get-all.php\" target=\"_blank\" rel=\"noopener noreferrer\">token stream<\/a><\/li>\n<li>Slack open-sourced\u00a0<a href=\"https:\/\/github.com\/slackhq\/hack-json-schema\" target=\"_blank\" rel=\"noopener noreferrer\">Hack JSON Schema<\/a>, which leverages\u00a0<a href=\"https:\/\/github.com\/hhvm\/hack-codegen\" target=\"_blank\" rel=\"noopener noreferrer\">Hack Codegen<\/a>\u00a0to create Hack code and type definitions from JSON schema definitions<\/li>\n<li><a href=\"https:\/\/github.com\/slackhq\/hack-sql-fake\" target=\"_blank\" rel=\"noopener noreferrer\">Hack SQL Fake<\/a>\u00a0is a library I contributed to simulate MySQL for use in unit tests. It handles millions of SQL queries in every test run at Slack.<\/li>\n<li><a href=\"https:\/\/github.com\/hhvm\/xhp-lib\" target=\"_blank\" rel=\"noopener noreferrer\">XHP<\/a>\u00a0enables type-safe, async server-side rendered HTML and shares a history with React\u2019s JSX. It\u2019s the best server-side HTML framework I\u2019ve worked with.<\/li>\n<\/ul>\n<h2>Looking forward<\/h2>\n<p>As Hack frees itself from its PHP past, I\u2019m excited to see it become a first-class language in its own right. 
While it\u2019s no longer feasible to gradually migrate a PHP codebase to Hack, I expect to see more developers choose Hack for new projects as the language stabilizes, especially if they have familiarity with PHP and are looking for something better.<\/p>\n<p>There\u2019s a general trend in the industry towards adding static type checking to interpreted languages, with multiple\u00a0<a href=\"https:\/\/pyre-check.org\/\" target=\"_blank\" rel=\"noopener noreferrer\">options<\/a>\u00a0for\u00a0<a href=\"http:\/\/mypy-lang.org\/\" target=\"_blank\" rel=\"noopener noreferrer\">Python<\/a>,\u00a0<a href=\"https:\/\/www.typescriptlang.org\/\" target=\"_blank\" rel=\"noopener noreferrer\">JavaScript<\/a>, and\u00a0<a href=\"https:\/\/sorbet.org\/\" target=\"_blank\" rel=\"noopener noreferrer\">Ruby<\/a>. Combining the convenience of interpreted languages with static type checking is worth considering for code bases of all sizes.<\/p>\n<hr \/>\n<p><em>Scott Sandler is a Principal Engineer on the Core Infrastructure team at Slack. Slack is\u00a0<\/em><a href=\"https:\/\/slack.com\/careers\/location\/all-locations\/dept\/engineering\" target=\"_blank\" rel=\"noopener noreferrer\"><em>hiring<\/em><\/a><em>\u00a0backend engineers.<\/em><\/p>\n","protected":false},"excerpt":{"rendered":"Slack launched in 2014 with a PHP 5 backend. Along with\u00a0several\u00a0other\u00a0companies, we switched to\u00a0HHVM\u00a0in 2016 because it ran our PHP code faster. We stayed with HHVM because it offers an entirely new language:\u00a0Hack\u00a0(searchable as Hacklang). Hack makes our developers faster by improving productivity through better tooling. 
Hack began as a superset of PHP, retaining its\u00a0best&hellip;","protected":false},"author":139,"featured_media":12729,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[3],"tags":[572,573,607,638],"class_list":{"0":"post-11775","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-uncategorized","8":"tag-hacklang","9":"tag-hhvm","10":"tag-php","11":"tag-static-typing","12":"ts-entry"},"acf":{"subtitle":"How and why Slack migrated to Hack, the benefits it gave us, and things to consider for your own codebase.","author_group":{"configure_author":"wordpress","authors":[{"ID":12082,"post_author":"3","post_date":"2020-04-27 17:30:00","post_date_gmt":"2020-04-27 17:30:00","post_content":"","post_title":"Scott Sandler","post_excerpt":"","post_status":"publish","comment_status":"closed","ping_status":"closed","post_password":"","post_name":"scott-sandler","to_ping":"","pinged":"","post_modified":"2020-05-14 15:35:28","post_modified_gmt":"2020-05-14 15:35:28","post_content_filtered":"","post_parent":0,"guid":"https:\/\/slackhq.com\/engineering\/?p=12082","menu_order":0,"post_type":"author","post_mime_type":"","comment_count":"0","filter":"raw"}],"custom_author":"Scott 
Sandler"},"tags":[572,573,607,638],"series":false},"jetpack_featured_media_url":"https:\/\/slack.engineering\/wp-content\/uploads\/sites\/7\/2020\/04\/1_j80dms3VT7UVXQ-T9amdpw.jpeg","_links":{"self":[{"href":"https:\/\/slack.engineering\/wp-json\/wp\/v2\/posts\/11775","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/slack.engineering\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/slack.engineering\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/slack.engineering\/wp-json\/wp\/v2\/users\/139"}],"replies":[{"embeddable":true,"href":"https:\/\/slack.engineering\/wp-json\/wp\/v2\/comments?post=11775"}],"version-history":[{"count":3,"href":"https:\/\/slack.engineering\/wp-json\/wp\/v2\/posts\/11775\/revisions"}],"predecessor-version":[{"id":16367,"href":"https:\/\/slack.engineering\/wp-json\/wp\/v2\/posts\/11775\/revisions\/16367"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/slack.engineering\/wp-json\/wp\/v2\/media\/12729"}],"wp:attachment":[{"href":"https:\/\/slack.engineering\/wp-json\/wp\/v2\/media?parent=11775"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/slack.engineering\/wp-json\/wp\/v2\/categories?post=11775"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/slack.engineering\/wp-json\/wp\/v2\/tags?post=11775"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}