{"id":22498,"date":"2025-05-20T09:00:29","date_gmt":"2025-05-20T16:00:29","guid":{"rendered":"https:\/\/engineering.fb.com\/?p=22498"},"modified":"2025-05-19T20:54:00","modified_gmt":"2025-05-20T03:54:00","slug":"metas-full-stack-hhvm-optimizations-for-genai","status":"publish","type":"post","link":"https:\/\/engineering.fb.com\/2025\/05\/20\/web\/metas-full-stack-hhvm-optimizations-for-genai\/","title":{"rendered":"Meta&#8217;s Full-stack HHVM optimizations for GenAI"},"content":{"rendered":"<p><span style=\"font-weight: 400;\">As Meta has launched new, innovative products leveraging generative AI (GenAI), we need to make sure the underlying infrastructure components evolve along with them. Applying infrastructure knowledge and optimizations has allowed us to adapt to changing product requirements, delivering a better product along the way. Ultimately, our infrastructure systems need to balance our need to ship high-quality experiences with a need to run systems sustainably. <\/span><\/p>\n<p><span style=\"font-weight: 400;\">Splitting GenAI inference traffic out into a dedicated WWW tenant, which allows specialized runtime and warm-up configuration, has enabled us to meet both of those goals while delivering a 30% improvement in latency.\u00a0<\/span><\/p>\n<div class=\"jetpack-video-wrapper\"><span class=\"embed-youtube\" style=\"text-align:center; display: block;\"><iframe loading=\"lazy\" class=\"youtube-player\" width=\"4000\" height=\"2250\" src=\"https:\/\/www.youtube.com\/embed\/QBIqvBy3lqg?version=3&#038;rel=1&#038;showsearch=0&#038;showinfo=1&#038;iv_load_policy=1&#038;fs=1&#038;hl=en-US&#038;autohide=2&#038;wmode=transparent\" allowfullscreen=\"true\" style=\"border:0;\" sandbox=\"allow-scripts allow-same-origin allow-popups allow-presentation allow-popups-to-escape-sandbox\"><\/iframe><\/span><\/div>\n<h2><span style=\"font-weight: 400;\">Who we are<\/span><\/h2>\n<p><span style=\"font-weight: 400;\">As the Web Foundation team, we operate Meta\u2019s 
monolithic web tier, running <\/span><a href=\"https:\/\/hacklang.org\/\"><span style=\"font-weight: 400;\">Hack<\/span><\/a><span style=\"font-weight: 400;\">. The team is composed of <\/span><span style=\"font-weight: 400;\">cross-functional engineers who make sure the infrastructure behind the web tier is healthy and well designed. <\/span><span style=\"font-weight: 400;\">We jump into incident response, work on some of the most complex areas of the infrastructure, and help build whatever we need to keep the site happily up and running.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">To accomplish this, we have established a series of best practices on being a \u201cgood citizen\u201d of the shared tier. We need to ensure that all requests comply with these guidelines to prevent issues from spilling over and affecting other teams\u2019 products. One core rule is the request runtime\u2014limiting a request to 30 seconds of execution. This is a consequence of the <\/span><a href=\"https:\/\/docs.hhvm.com\/hhvm\/\"><span style=\"font-weight: 400;\">HHVM (<\/span><span style=\"font-weight: 400;\">HipHop Virtual Machine) <\/span><span style=\"font-weight: 400;\">runtime<\/span><\/a><span style=\"font-weight: 400;\">\u2014each request has a corresponding worker thread, of which there is a finite number. To ensure there are always threads available to serve incoming requests, we need to balance the resources available on each host with its expected throughput. If requests are taking too long, there will be fewer available threads to process new requests, leading to user-visible unavailability.\u00a0<\/span><\/p>\n<h2><span style=\"font-weight: 400;\">The changing landscape<\/span><\/h2>\n<p><span style=\"font-weight: 400;\">Classically, webservers at Meta are optimized for serving front-end requests\u2014rendering webpages and serving GraphQL queries. 
These requests\u2019 latency is typically measured in hundreds of milliseconds to seconds (substantially below the 30-second limit), which enables hosts to process approximately 500 queries per second.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Additionally, a webserver typically spends about two-thirds of its time doing input\/output (I\/O), and the remaining third doing CPU work. This fact has influenced the design of the Hack language, which supports <\/span><span style=\"font-weight: 400;\">asyncio<\/span><span style=\"font-weight: 400;\">, a type of cooperative multitasking. All the core libraries support these primitives, which increases performance and decreases the amount of time the CPU sits idle waiting for I\/O.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">GenAI products, especially LLMs, have a different set of requirements. These are driven by the core inference flow: The model responds with a stream of tokens that can take seconds or minutes to complete. A user may see this as a chatbot \u201ctyping\u201d a response. This isn\u2019t an effect to make our products seem friendlier; it\u2019s the speed at which our models think! After a user submits a query to the model, we need to start streaming these responses back to the user as fast as possible. On top of that, the total latency of the request is now substantially longer (measured in seconds). These properties translate into two requirements for the infrastructure: minimal overhead on the critical path before calling the LLM, and support for a long duration for the rest of the request, most of which is spent waiting on I\/O. 
(See Figures 1 and 2 below).<\/span><\/p>\n<figure id=\"attachment_22500\" aria-describedby=\"caption-attachment-22500\" style=\"width: 600px\" class=\"wp-caption alignnone\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-22500\" src=\"https:\/\/engineering.fb.com\/wp-content\/uploads\/2025\/05\/Meta-GenAI-HHVM-figure-1.png?w=1024\" alt=\"\" width=\"600\" height=\"288\" srcset=\"https:\/\/engineering.fb.com\/wp-content\/uploads\/2025\/05\/Meta-GenAI-HHVM-figure-1.png 1419w, https:\/\/engineering.fb.com\/wp-content\/uploads\/2025\/05\/Meta-GenAI-HHVM-figure-1.png?resize=916,440 916w, https:\/\/engineering.fb.com\/wp-content\/uploads\/2025\/05\/Meta-GenAI-HHVM-figure-1.png?resize=768,369 768w, https:\/\/engineering.fb.com\/wp-content\/uploads\/2025\/05\/Meta-GenAI-HHVM-figure-1.png?resize=1024,491 1024w, https:\/\/engineering.fb.com\/wp-content\/uploads\/2025\/05\/Meta-GenAI-HHVM-figure-1.png?resize=96,46 96w, https:\/\/engineering.fb.com\/wp-content\/uploads\/2025\/05\/Meta-GenAI-HHVM-figure-1.png?resize=192,92 192w\" sizes=\"auto, (max-width: 992px) 100vw, 62vw\" \/><figcaption id=\"caption-attachment-22500\" class=\"wp-caption-text\">Figure 1: Percent of time spent on I\/O, typical requests (~70%) vs. 
GenAI (~90%).<\/figcaption><\/figure>\n<p>&nbsp;<\/p>\n<figure id=\"attachment_22501\" aria-describedby=\"caption-attachment-22501\" style=\"width: 600px\" class=\"wp-caption alignnone\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-22501\" src=\"https:\/\/engineering.fb.com\/wp-content\/uploads\/2025\/05\/Meta-GenAI-HHVM-image3.png?w=1024\" alt=\"\" width=\"600\" height=\"395\" srcset=\"https:\/\/engineering.fb.com\/wp-content\/uploads\/2025\/05\/Meta-GenAI-HHVM-image3.png 1314w, https:\/\/engineering.fb.com\/wp-content\/uploads\/2025\/05\/Meta-GenAI-HHVM-image3.png?resize=916,602 916w, https:\/\/engineering.fb.com\/wp-content\/uploads\/2025\/05\/Meta-GenAI-HHVM-image3.png?resize=768,505 768w, https:\/\/engineering.fb.com\/wp-content\/uploads\/2025\/05\/Meta-GenAI-HHVM-image3.png?resize=1024,673 1024w, https:\/\/engineering.fb.com\/wp-content\/uploads\/2025\/05\/Meta-GenAI-HHVM-image3.png?resize=96,63 96w, https:\/\/engineering.fb.com\/wp-content\/uploads\/2025\/05\/Meta-GenAI-HHVM-image3.png?resize=192,126 192w\" sizes=\"auto, (max-width: 992px) 100vw, 62vw\" \/><figcaption id=\"caption-attachment-22501\" class=\"wp-caption-text\">Figure 2: Overall request latency CDF; typical requests vs. GenAI.<\/figcaption><\/figure>\n<h2><span style=\"font-weight: 400;\">A series of optimizations<\/span><\/h2>\n<p><span style=\"font-weight: 400;\">This shift in requirements allowed Web Foundation to reexamine the rules of running the monolithic web tier. We then launched a dedicated web tenant (a standalone deployment of WWW) that allowed custom configuration, which we could better tune to the needs of the workload.<\/span><\/p>\n<h3><span style=\"font-weight: 400;\">Request timeout<\/span><\/h3>\n<p><span style=\"font-weight: 400;\">First, running on an isolated web tier allowed us to increase the runtime limit for GenAI requests. 
This is a straightforward change, but it allowed us to isolate the longer-running traffic to avoid adverse impacts on the rest of the production tier. This way, we can avoid requests timing out if inference takes longer than 30 seconds.<\/span><\/p>\n<h3><span style=\"font-weight: 400;\">Thread-pool sizing<\/span><\/h3>\n<p><span style=\"font-weight: 400;\">Running requests for longer means there is reduced availability of worker threads (which, remember, map 1:1 with processed requests). Since webservers have a finite amount of memory, we can divide the total memory available by the per-request memory limit to get the peak number of requests a host can execute simultaneously, which in turn bounds the number of worker threads. We ended up running with approximately 1,000 threads on GenAI hosts, as compared to a couple of hundred on normal webservers.<\/span><\/p>\n<h3><span style=\"font-weight: 400;\">JIT cache and \u201cJump-Start\u201d<\/span><\/h3>\n<p><span style=\"font-weight: 400;\">HHVM is a just-in-time (JIT) compiled runtime, which means the first time a given function executes, the runtime needs to compile it to lower-level machine code for execution. Additionally, a technique called <\/span><a href=\"https:\/\/engineering.fb.com\/2021\/03\/03\/developer-tools\/hhvm-jump-start\/\"><span style=\"font-weight: 400;\">Jump-Start<\/span><\/a><span style=\"font-weight: 400;\"> allows a webserver to seed its JIT cache with outputs from a previously warmed server. By allowing GenAI hosts to use Jump-Start profiles from the main web tier, we are able to greatly speed up execution, even though the two tiers do not execute exactly the same code.\u00a0<\/span><\/p>\n<h3><span style=\"font-weight: 400;\">Request warm-up<\/span><\/h3>\n<p><span style=\"font-weight: 400;\">HHVM also supports executing dummy requests at server startup, whose results we discard. The intent here is to warm non-code caches within the webserver. 
Configuration values and service discovery info are normally fetched inline the first time they are needed and then cached within the webserver. By fetching and caching this information in warm-up requests, we prevent our users from observing the latency of these initial fetches.\u00a0<\/span><\/p>\n<h3><span style=\"font-weight: 400;\">Shadow traffic<\/span><\/h3>\n<p><span style=\"font-weight: 400;\">Finally, Meta heavily uses real-time configuration to control feature rollouts, which means that Jump-Start profiles consumed at startup time might not cover all <\/span><i><span style=\"font-weight: 400;\">future<\/span><\/i><span style=\"font-weight: 400;\"> code paths the server will execute. To maintain coverage in the steady state, we also added request shadowing, so we can ensure that gating changes are still covered in the JIT cache.<\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>As Meta has launched new, innovative products leveraging generative AI (GenAI), we need to make sure the underlying infrastructure components evolve along with them. Applying infrastructure knowledge and optimizations has allowed us to adapt to changing product requirements, delivering a better product along the way. 
Ultimately, our infrastructure systems need to balance our need to [&#8230;]<\/p>\n<p><a class=\"btn btn-secondary understrap-read-more-link\" href=\"https:\/\/engineering.fb.com\/2025\/05\/20\/web\/metas-full-stack-hhvm-optimizations-for-genai\/\">Read More&#8230;<\/a><\/p>\n","protected":false},"author":51,"featured_media":20686,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[65,66,67,6],"tags":[],"class_list":["post-22498","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-data-infrastructure","category-developer-tools","category-production-engineering","category-web","fb_content_type-article"],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v19.3 (Yoast SEO v19.12) - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Meta&#039;s Full-stack HHVM optimizations for GenAI - Engineering at Meta<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/engineering.fb.com\/2025\/05\/20\/web\/metas-full-stack-hhvm-optimizations-for-genai\/\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Phil Lopreiato, Zach Zundel\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"5 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/engineering.fb.com\/2025\/05\/20\/web\/metas-full-stack-hhvm-optimizations-for-genai\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/engineering.fb.com\/2025\/05\/20\/web\/metas-full-stack-hhvm-optimizations-for-genai\/\"},\"author\":{\"@id\":\"https:\/\/engineering.fb.com\/2025\/05\/20\/web\/metas-full-stack-hhvm-optimizations-for-genai\/#author\",\"name\":\"\"},\"headline\":\"Meta&#8217;s Full-stack HHVM optimizations for GenAI\",\"datePublished\":\"2025-05-20T16:00:29+00:00\",\"dateModified\":\"2025-05-20T03:54:00+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/engineering.fb.com\/2025\/05\/20\/web\/metas-full-stack-hhvm-optimizations-for-genai\/\"},\"wordCount\":1005,\"publisher\":{\"@id\":\"https:\/\/engineering.fb.com\/#organization\"},\"articleSection\":[\"Data Infrastructure\",\"DevInfra\",\"Production Engineering\",\"Web\"],\"inLanguage\":\"en-US\"},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/engineering.fb.com\/2025\/05\/20\/web\/metas-full-stack-hhvm-optimizations-for-genai\/\",\"url\":\"https:\/\/engineering.fb.com\/2025\/05\/20\/web\/metas-full-stack-hhvm-optimizations-for-genai\/\",\"name\":\"Meta's Full-stack HHVM optimizations for GenAI - Engineering at 
Meta\",\"isPartOf\":{\"@id\":\"https:\/\/engineering.fb.com\/#website\"},\"datePublished\":\"2025-05-20T16:00:29+00:00\",\"dateModified\":\"2025-05-20T03:54:00+00:00\",\"breadcrumb\":{\"@id\":\"https:\/\/engineering.fb.com\/2025\/05\/20\/web\/metas-full-stack-hhvm-optimizations-for-genai\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/engineering.fb.com\/2025\/05\/20\/web\/metas-full-stack-hhvm-optimizations-for-genai\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/engineering.fb.com\/2025\/05\/20\/web\/metas-full-stack-hhvm-optimizations-for-genai\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/engineering.fb.com\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Meta&#8217;s Full-stack HHVM optimizations for GenAI\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/engineering.fb.com\/#website\",\"url\":\"https:\/\/engineering.fb.com\/\",\"name\":\"Engineering at Meta\",\"description\":\"Engineering at Meta Blog\",\"publisher\":{\"@id\":\"https:\/\/engineering.fb.com\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/engineering.fb.com\/?s={search_term_string}\"},\"query-input\":\"required 
name=search_term_string\"}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/engineering.fb.com\/#organization\",\"name\":\"Meta\",\"url\":\"https:\/\/engineering.fb.com\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/engineering.fb.com\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/engineering.fb.com\/wp-content\/uploads\/2023\/08\/Meta_lockup_positive-primary_RGB.jpg\",\"contentUrl\":\"https:\/\/engineering.fb.com\/wp-content\/uploads\/2023\/08\/Meta_lockup_positive-primary_RGB.jpg\",\"width\":29011,\"height\":12501,\"caption\":\"Meta\"},\"image\":{\"@id\":\"https:\/\/engineering.fb.com\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/www.facebook.com\/Engineering\/\",\"https:\/\/twitter.com\/fb_engineering\"]},[]]}<\/script>\n<!-- \/ Yoast SEO Premium plugin. -->","yoast_head_json":{"title":"Meta's Full-stack HHVM optimizations for GenAI - Engineering at Meta","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/engineering.fb.com\/2025\/05\/20\/web\/metas-full-stack-hhvm-optimizations-for-genai\/","twitter_misc":{"Written by":"Phil Lopreiato, Zach Zundel","Est. 
reading time":"5 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/engineering.fb.com\/2025\/05\/20\/web\/metas-full-stack-hhvm-optimizations-for-genai\/#article","isPartOf":{"@id":"https:\/\/engineering.fb.com\/2025\/05\/20\/web\/metas-full-stack-hhvm-optimizations-for-genai\/"},"author":{"@id":"https:\/\/engineering.fb.com\/2025\/05\/20\/web\/metas-full-stack-hhvm-optimizations-for-genai\/#author","name":""},"headline":"Meta&#8217;s Full-stack HHVM optimizations for GenAI","datePublished":"2025-05-20T16:00:29+00:00","dateModified":"2025-05-20T03:54:00+00:00","mainEntityOfPage":{"@id":"https:\/\/engineering.fb.com\/2025\/05\/20\/web\/metas-full-stack-hhvm-optimizations-for-genai\/"},"wordCount":1005,"publisher":{"@id":"https:\/\/engineering.fb.com\/#organization"},"articleSection":["Data Infrastructure","DevInfra","Production Engineering","Web"],"inLanguage":"en-US"},{"@type":"WebPage","@id":"https:\/\/engineering.fb.com\/2025\/05\/20\/web\/metas-full-stack-hhvm-optimizations-for-genai\/","url":"https:\/\/engineering.fb.com\/2025\/05\/20\/web\/metas-full-stack-hhvm-optimizations-for-genai\/","name":"Meta's Full-stack HHVM optimizations for GenAI - Engineering at Meta","isPartOf":{"@id":"https:\/\/engineering.fb.com\/#website"},"datePublished":"2025-05-20T16:00:29+00:00","dateModified":"2025-05-20T03:54:00+00:00","breadcrumb":{"@id":"https:\/\/engineering.fb.com\/2025\/05\/20\/web\/metas-full-stack-hhvm-optimizations-for-genai\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/engineering.fb.com\/2025\/05\/20\/web\/metas-full-stack-hhvm-optimizations-for-genai\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/engineering.fb.com\/2025\/05\/20\/web\/metas-full-stack-hhvm-optimizations-for-genai\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/engineering.fb.com\/"},{"@type":"ListItem","position":2,"name":"Meta&#8217;s 
Full-stack HHVM optimizations for GenAI"}]},{"@type":"WebSite","@id":"https:\/\/engineering.fb.com\/#website","url":"https:\/\/engineering.fb.com\/","name":"Engineering at Meta","description":"Engineering at Meta Blog","publisher":{"@id":"https:\/\/engineering.fb.com\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/engineering.fb.com\/?s={search_term_string}"},"query-input":"required name=search_term_string"}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/engineering.fb.com\/#organization","name":"Meta","url":"https:\/\/engineering.fb.com\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/engineering.fb.com\/#\/schema\/logo\/image\/","url":"https:\/\/engineering.fb.com\/wp-content\/uploads\/2023\/08\/Meta_lockup_positive-primary_RGB.jpg","contentUrl":"https:\/\/engineering.fb.com\/wp-content\/uploads\/2023\/08\/Meta_lockup_positive-primary_RGB.jpg","width":29011,"height":12501,"caption":"Meta"},"image":{"@id":"https:\/\/engineering.fb.com\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/Engineering\/","https:\/\/twitter.com\/fb_engineering"]},[]]}},"jetpack_featured_media_url":"https:\/\/engineering.fb.com\/wp-content\/uploads\/2023\/10\/Eng-Blog-Self-Serve-Hero-Images-DEBUGGING-201-Teale.jpg","jetpack_shortlink":"https:\/\/wp.me\/pa0Lhq-5QS","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/engineering.fb.com\/wp-json\/wp\/v2\/posts\/22498","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/engineering.fb.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/engineering.fb.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/engineering.fb.com\/wp-json\/wp\/v2\/users\/51"}],"replies":[{"embeddable":true,"href":"https:\/\/engineering.fb.com\/wp-json\/wp\/v2\/comments?post=22498"}],"version-history":[{"count":6,"href":"https:\/\/engineering.fb.com\/wp-json\/wp\/v2\/posts\/22498\
/revisions"}],"predecessor-version":[{"id":22524,"href":"https:\/\/engineering.fb.com\/wp-json\/wp\/v2\/posts\/22498\/revisions\/22524"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/engineering.fb.com\/wp-json\/wp\/v2\/media\/20686"}],"wp:attachment":[{"href":"https:\/\/engineering.fb.com\/wp-json\/wp\/v2\/media?parent=22498"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/engineering.fb.com\/wp-json\/wp\/v2\/categories?post=22498"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/engineering.fb.com\/wp-json\/wp\/v2\/tags?post=22498"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}