{"id":5270,"date":"2026-02-06T19:07:32","date_gmt":"2026-02-07T00:07:32","guid":{"rendered":"https:\/\/dft.wiki\/?p=5270"},"modified":"2026-06-08T09:41:28","modified_gmt":"2026-06-08T13:41:28","slug":"self-hosted-ai-coder-model-for-vscode","status":"publish","type":"post","link":"https:\/\/dft.wiki\/?p=5270","title":{"rendered":"Self-hosted AI Models for Coding and More"},"content":{"rendered":"<p>For an enthusiast, it&#8217;s a whole universe of exploration, privacy, anti-censorship, and more.<\/p>\n<p>The most popular tools and model sources are:<\/p>\n<ul>\n<li><strong>llama.cpp<\/strong> [<a href=\"https:\/\/github.com\/ggml-org\/llama.cpp\">Link<\/a>]\n<ul>\n<li>A highly optimized C\/C++ implementation designed to run LLMs locally.<\/li>\n<\/ul>\n<\/li>\n<li><strong>Ollama<\/strong> [<a href=\"https:\/\/ollama.com\/\">Link<\/a>]\n<ul>\n<li>A lightweight, extensible framework for running LLMs locally.<\/li>\n<\/ul>\n<\/li>\n<li><strong>Hugging Face<\/strong> [<a href=\"https:\/\/huggingface.co\/\">Link<\/a>]\n<ul>\n<li>The primary model hub for AI enthusiasts and researchers.<\/li>\n<\/ul>\n<\/li>\n<li><strong>LocalAI<\/strong> [<a href=\"https:\/\/localai.io\/\">Link<\/a>]\n<ul>\n<li>A containerized AI stack with a web UI, compatible with the OpenAI API.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<p><strong>Tip<\/strong><\/p>\n<p>The following tools can estimate whether your computer can run an LLM, and how well.<\/p>\n<ul>\n<li><strong>CanIRun.ai<\/strong> [<a href=\"https:\/\/www.canirun.ai\/\">Link<\/a>]<\/li>\n<li><strong>LLMfit<\/strong> [<a href=\"https:\/\/github.com\/AlexsJones\/llmfit\">Link<\/a>]<\/li>\n<\/ul>\n<hr \/>\n<p><strong>LLAMA.CPP<\/strong><\/p>\n<p>Download and run a model from Hugging Face.<\/p>\n<pre>llama-cli -hf <strong>Qwen\/Qwen2.5-Coder-7B-Instruct-GGUF<\/strong><\/pre>\n<p>Launch an OpenAI-compatible API server (port 8080 by default).<\/p>\n<pre>llama-server -hf 
<strong>Qwen\/Qwen2.5-Coder-7B-Instruct-GGUF<\/strong><\/pre>\n<p>Or<\/p>\n<pre>llama-server -m <strong>model.gguf<\/strong> --host 0.0.0.0 --port <strong>9000<\/strong><\/pre>\n<p>Inspect and verify a model file&#8217;s metadata and architecture.<\/p>\n<pre>llama-cli -m <strong>model.gguf<\/strong> --info<\/pre>\n<p>Start a continuous conversation.<\/p>\n<pre>llama-cli -m <strong>model.gguf<\/strong> -cnv<\/pre>\n<hr \/>\n<p><strong>OLLAMA<\/strong><\/p>\n<p>Installing Ollama:<\/p>\n<pre>curl -fsSL https:\/\/ollama.com\/install.sh | sh<\/pre>\n<p>Pulling coder models.<\/p>\n<pre>ollama pull <strong>deepseek-coder:1.3b<\/strong>\r\nollama pull <strong>qwen2.5-coder:1.5b-base<\/strong>\r\nollama pull <strong>qwen2.5-coder:3b<\/strong>\r\nollama pull <strong>deepseek-coder:6.7b<\/strong>\r\nollama pull <strong>codellama:7b<\/strong>\r\nollama pull <strong>qwen2.5-coder:7b<\/strong>\r\nollama pull <strong>yi-coder:9b<\/strong>\r\nollama pull <strong>qwen2.5-coder:14b<\/strong>\r\n<\/pre>\n<p>Running a specific model.<\/p>\n<pre>ollama run <strong>yi-coder:9b<\/strong><\/pre>\n<p>To exit the prompt, use <strong>Ctrl + D<\/strong> or type <code>\/bye<\/code>.<\/p>\n<p>Manage your models.<\/p>\n<pre>ollama help\r\nollama list\r\nollama ps\r\nollama stop <strong>yi-coder:9b<\/strong>\r\nollama rm <strong>yi-coder:9b<\/strong><\/pre>\n<hr \/>\n<p><strong>RUNNING AND TESTING<\/strong><\/p>\n<p>To make Ollama reachable over your network, modify the service configuration:<\/p>\n<pre>sudo nano \/etc\/systemd\/system\/ollama.service<\/pre>\n<p>Add the following environment variables to the <code>[Service]<\/code> section to bind to all network interfaces:<\/p>\n<pre>[Unit]\r\nDescription=Ollama Service\r\nAfter=network-online.target\r\n\r\n[Service]\r\n<strong>Environment=\"OLLAMA_HOST=0.0.0.0\"\r\nEnvironment=\"OLLAMA_ORIGINS=*\"<\/strong>\r\nExecStart=\/usr\/local\/bin\/ollama 
serve\r\nUser=ollama\r\nGroup=ollama\r\nRestart=always\r\nRestartSec=3\r\nEnvironment=\"PATH=\/usr\/local\/sbin:\/usr\/local\/bin:\/usr\/sbin:\/usr\/bin:\/sbin:\/bin\"\r\n\r\n[Install]\r\nWantedBy=default.target<\/pre>\n<p>Reload systemd and restart the service to apply the changes.<\/p>\n<pre>sudo systemctl daemon-reload\r\nsudo systemctl restart ollama\r\nsudo systemctl status ollama<\/pre>\n<p>From a remote host, test the connection via an HTTP request:<\/p>\n<pre>curl -s http:\/\/192.168.1.101:11434\/api\/generate -d '{\r\n  \"model\": \"<strong>qwen2.5-coder:1.5b-base<\/strong>\",\r\n  \"prompt\": \"When was Python 3 first released?\",\r\n  \"stream\": false\r\n}'<\/pre>\n<p>Example output.<\/p>\n<pre>{\r\n  \"model\": \"qwen2.5-coder:1.5b-base\",\r\n  \"created_at\": \"2026-02-07T01:34:02.891211247Z\",\r\n  \"response\": \"&lt;redacted_for_brevity&gt;\",\r\n  \"done\": true,\r\n  \"done_reason\": \"stop\",\r\n  \"context\": [\r\n    &lt;redacted_for_brevity&gt;\r\n  ],\r\n  \"total_duration\": 260501788901,\r\n  \"load_duration\": 2392809850,\r\n  \"prompt_eval_count\": 7,\r\n  \"prompt_eval_duration\": 687966008,\r\n  \"eval_count\": 1065,\r\n  \"eval_duration\": 253858547320\r\n}<\/pre>\n<p>Success!<\/p>\n<hr \/>\n<p><strong>BENCHMARKING<\/strong><\/p>\n<p>The <code>llm-benchmark<\/code> tool measures inference speed (tokens per second) on your hardware [<a href=\"https:\/\/pypi.org\/project\/llm-benchmark\/\">Link<\/a>].<\/p>\n<pre>sudo apt install python3-venv -y\r\npython3 -m venv .venv\r\nsource .venv\/bin\/activate\r\npip install llm-benchmark\r\nnano custom.yml<\/pre>\n<pre>file_name: \"custom.yml\"\r\nversion: 2.0.custom\r\nmodels:\r\n- model: \"yi-coder:9b\"\r\n- model: \"codellama:7b\"\r\n- model: \"qwen2.5-coder:7b\"\r\n- model: \"qwen2.5-coder:3b\"\r\n- model: \"qwen2.5-coder:14b\"\r\n- model: \"deepseek-coder:6.7b\"\r\n- model: \"qwen2.5-coder:1.5b-base\"\r\n- model: \"deepseek-coder:1.3b\"<\/pre>\n<pre>llm_benchmark run --custombenchmark=custom.yml<\/pre>\n<p>Here are some acceptable 
results (summarized for brevity) using an <strong>NVIDIA P4 (PG414) 8GB VRAM<\/strong> [<a href=\"https:\/\/www.techpowerup.com\/gpu-specs\/tesla-p4.c2879\">Link<\/a>].<\/p>\n<pre>----------------------------------------\r\nmodel_name =    <strong>deepseek-coder:1.3b<\/strong>\r\nAverage of eval rate:  <strong>125<\/strong>.67  tokens\/s\r\n----------------------------------------\r\nmodel_name =    <strong>qwen2.5-coder:1.5b-base<\/strong>\r\nAverage of eval rate:  <strong>87<\/strong>.382  tokens\/s\r\n----------------------------------------\r\nmodel_name =    <strong>qwen2.5-coder:3b<\/strong>\r\nAverage of eval rate:  <strong>53<\/strong>.126  tokens\/s\r\n----------------------------------------\r\nmodel_name =    <strong>deepseek-coder:6.7b<\/strong>\r\nAverage of eval rate:  <strong>35<\/strong>.746  tokens\/s\r\n----------------------------------------\r\nmodel_name =    <strong>codellama:7b<\/strong>\r\nAverage of eval rate:  <strong>34<\/strong>.978  tokens\/s\r\n----------------------------------------\r\nmodel_name =    <strong>qwen2.5-coder:7b<\/strong>\r\nAverage of eval rate:  <strong>28<\/strong>.172  tokens\/s\r\n----------------------------------------\r\nmodel_name =    <strong>yi-coder:9b<\/strong>\r\nAverage of eval rate:  <strong>26<\/strong>.79  tokens\/s\r\n----------------------------------------\r\nmodel_name =    <strong>qwen2.5-coder:14b<\/strong>\r\nAverage of eval rate:  <strong>2<\/strong>.286  tokens\/s\r\n----------------------------------------<\/pre>\n<p><strong>Note:<\/strong> Eval rates <span style=\"text-decoration: underline;\">&gt;30 tokens\/s are excellent<\/span>, while &lt;10 tokens\/s are extremely slow. The command <code>ollama ps<\/code> shows which processor each loaded model is using. 
In this case, the model did not fit entirely on the GPU, so the CPU is handling a few layers.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-5305\" src=\"https:\/\/dft.wiki\/wp-content\/uploads\/sites\/15\/2026\/02\/Screenshot_2026-02-13_18-44-47.png\" alt=\"Ollama GPU usage\" width=\"795\" height=\"42\" srcset=\"https:\/\/dft.wiki\/wp-content\/uploads\/sites\/15\/2026\/02\/Screenshot_2026-02-13_18-44-47.png 795w, https:\/\/dft.wiki\/wp-content\/uploads\/sites\/15\/2026\/02\/Screenshot_2026-02-13_18-44-47-300x16.png 300w, https:\/\/dft.wiki\/wp-content\/uploads\/sites\/15\/2026\/02\/Screenshot_2026-02-13_18-44-47-768x41.png 768w\" sizes=\"auto, (max-width: 795px) 100vw, 795px\" \/><\/p>\n<p>Here are slightly better results using an <strong>NVIDIA T4 (PG183) 16GB VRAM<\/strong> [<a href=\"https:\/\/www.techpowerup.com\/gpu-specs\/tesla-t4.c3316\">Link<\/a>].<\/p>\n<pre>(pending)<\/pre>\n<p>Here are some unacceptable results that fell back to the CPU.<\/p>\n<pre>----------------------------------------\r\nmodel_name = <strong>deepseek-coder:1.3b<\/strong>\r\nAverage of eval rate: <strong>7<\/strong>.364 tokens\/s\r\n----------------------------------------\r\nmodel_name = <strong>qwen2.5-coder:1.5b-base<\/strong>\r\nAverage of eval rate: <strong>5<\/strong>.154 tokens\/s\r\n----------------------------------------\r\nmodel_name = <strong>deepseek-coder-v2:16b<\/strong>\r\nAverage of eval rate: <strong>3<\/strong>.428 tokens\/s\r\n----------------------------------------\r\nmodel_name = <strong>qwen2.5-coder:3b<\/strong>\r\nAverage of eval rate: <strong>2<\/strong>.942 tokens\/s\r\n----------------------------------------\r\nmodel_name = <strong>deepseek-coder:6.7b<\/strong> \r\nAverage of eval rate: <strong>1<\/strong>.328 tokens\/s \r\n---------------------------------------- \r\nmodel_name = <strong>qwen2.5-coder:14b<\/strong>\r\nAverage of eval rate: <strong>0<\/strong>.704 
tokens\/s\r\n----------------------------------------\r\n<\/pre>\n<p><strong>Note:<\/strong> Without GPU acceleration, eval rates are extremely slow.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-5284\" src=\"https:\/\/dft.wiki\/wp-content\/uploads\/sites\/15\/2026\/02\/Screenshot_2026-02-08_17-11-35.png\" alt=\"\" width=\"701\" height=\"40\" srcset=\"https:\/\/dft.wiki\/wp-content\/uploads\/sites\/15\/2026\/02\/Screenshot_2026-02-08_17-11-35.png 701w, https:\/\/dft.wiki\/wp-content\/uploads\/sites\/15\/2026\/02\/Screenshot_2026-02-08_17-11-35-300x17.png 300w\" sizes=\"auto, (max-width: 701px) 100vw, 701px\" \/><\/p>\n<p style=\"text-align: left;\">If this is your situation (as in the example above), verify that the GPU drivers and kernel modules are properly installed and loaded. Choose the architecture carefully: AMD (Polaris, Vega, <strong>RDNA 3+<\/strong>) or NVIDIA (Pascal, Turing, <strong>Ampere<\/strong>).<\/p>\n<p>As a last resort, try overriding GPU detection with environment variables in the service file:<\/p>\n<pre>nano \/etc\/systemd\/system\/ollama.service<\/pre>\n<pre>[Service]\r\nEnvironment=\"<strong>HSA_OVERRIDE_GFX_VERSION=8.0.3<\/strong>\"\r\n#Environment=\"<strong>OLLAMA_VULKAN=1<\/strong>\"<\/pre>\n<pre>systemctl daemon-reload\r\nsystemctl restart ollama\r\nsleep 10\r\njournalctl -u ollama.service --since \"1 minute ago\"<\/pre>\n<p>In the following output, the override was unsuccessful. 
Hopefully you&#8217;ll have better luck.<\/p>\n<pre>...\r\nlevel=INFO source=runner.go:67 msg=\"discovering available GPUs...\"\r\nlevel=WARN source=runner.go:485 msg=<strong>\"user overrode visible devices\" HSA_OVERRIDE_GFX_VERSION=8.0.3<\/strong>\r\nlevel=WARN source=runner.go:489 msg=\"if GPUs are not correctly discovered, unset and try again\"\r\n...<\/pre>\n<hr \/>\n<p><strong>INTEGRATION<\/strong><\/p>\n<p>In <strong>VS Code<\/strong>, install popular extensions like <strong>Continue<\/strong> or <strong>Roo Code<\/strong> to use your local models for autocomplete and chat.<\/p>\n<p>Example configuration for <strong>Continue<\/strong>.<\/p>\n<pre>nano ~\/.continue\/config.yaml<\/pre>\n<pre>name: Local Config\r\nversion: 1.0.0\r\nschema: v1\r\nmodels:\r\n  - name: \"Nerdsking Python 7B\"\r\n    provider: ollama\r\n    model: hf.co\/Nerdsking\/Nerdsking-python-coder-7B-i:latest\r\n    apiBase: http:\/\/192.168.1.101:11434\r\n    roles: [\"autocomplete\", \"edit\", \"apply\", \"chat\"]\r\n\r\n  - name: \"Qwen2.5 Coder 7B\"\r\n    provider: ollama\r\n    model: qwen2.5-coder:7b\r\n    apiBase: http:\/\/192.168.1.101:11434\r\n    roles: [\"autocomplete\", \"edit\", \"apply\", \"chat\"]\r\n\r\n  - name: \"YI-coder 9B\"\r\n    provider: ollama\r\n    model: yi-coder:9b\r\n    apiBase: http:\/\/192.168.1.101:11434\r\n    roles: [\"autocomplete\", \"edit\", \"apply\", \"chat\"]\r\n\r\ntabAutocompleteModel:\r\n  name: \"Nerdsking Python 3B\"\r\n  provider: ollama\r\n  model: hf.co\/Nerdsking\/nerdsking-python-coder-3B-i:Q8_0\r\n  apiBase: http:\/\/192.168.1.101:11434<\/pre>\n<p>Example configuration for <strong>Roo Code<\/strong>.<\/p>\n<pre>(pending)<\/pre>\n<hr \/>\n<p><strong>HUGGING FACE &amp; GGUF<\/strong><\/p>\n<p>Hugging Face makes it easy to use a model on virtually any framework, app, or cloud.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-5325\" 
src=\"https:\/\/dft.wiki\/wp-content\/uploads\/sites\/15\/2026\/02\/Screenshot_2026-02-19_22-16-26.png\" alt=\"\" width=\"265\" height=\"624\" srcset=\"https:\/\/dft.wiki\/wp-content\/uploads\/sites\/15\/2026\/02\/Screenshot_2026-02-19_22-16-26.png 265w, https:\/\/dft.wiki\/wp-content\/uploads\/sites\/15\/2026\/02\/Screenshot_2026-02-19_22-16-26-127x300.png 127w\" sizes=\"auto, (max-width: 265px) 100vw, 265px\" \/><\/p>\n<p><strong>GGUF<\/strong> (GPT-Generated Unified Format)<\/p>\n<ul>\n<li>A binary model file format<\/li>\n<li>Bundles quantization and metadata in a single file<\/li>\n<li>Enables fully offline inference<\/li>\n<\/ul>\n<p>If the model on HF includes a <code>.gguf<\/code> file, pulling it into Ollama is as simple as:<\/p>\n<pre>ollama pull hf.co\/Nerdsking\/Nerdsking-python-coder-7B-i<\/pre>\n<p>Or<\/p>\n<pre>ollama pull hf.co\/Nerdsking\/nerdsking-python-coder-3B-i:Q8_0<\/pre>\n<p>Alternatively, download the <code>.gguf<\/code> file manually and create a model from it.<\/p>\n<pre>echo 'FROM .\/flux1-dev-Q4_1.gguf' &gt; Modelfile\r\nollama create flux1-dev-Q4_1 -f Modelfile<\/pre>\n<ul>\n<li>Why run Hugging Face models with Ollama?\n<ul>\n<li>It&#8217;s the easiest way to interact with models, especially for coding integrations.<\/li>\n<\/ul>\n<\/li>\n<li>What are Ollama&#8217;s limitations?\n<ul>\n<li>For specialized models, such as image generation, it&#8217;s better to use dedicated apps or write your own code.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<p><strong>Downloading Models<\/strong><\/p>\n<p>Install the HF CLI tool.<\/p>\n<pre>curl -LsSf https:\/\/hf.co\/cli\/install.sh | bash\r\ncurl -sSfL https:\/\/hf.co\/git-xet\/install.sh | sudo sh<\/pre>\n<p>Optionally, log in to HF.<\/p>\n<pre>hf auth login<\/pre>\n<p>Preview the download, then proceed.<\/p>\n<pre>hf download unsloth\/Z-Image-GGUF z-image-Q6_K.gguf --local-dir .\/models --dry-run\r\nhf download unsloth\/Z-Image-GGUF z-image-Q6_K.gguf --local-dir .\/models<\/pre>\n<p>Or download the entire 
repository.<\/p>\n<pre>hf download black-forest-labs\/FLUX.1-dev<\/pre>\n<p><strong>Coding with AI<\/strong><\/p>\n<p>There are multiple ways to install dependencies, but here I&#8217;ll demonstrate using pre-built container images for image generation.<\/p>\n<pre>docker pull huggingface\/transformers-pytorch-gpu:latest\r\ndocker run -it --gpus all -v $(pwd)\/models:\/root\/.cache\/huggingface --name hf huggingface\/transformers-pytorch-gpu:latest \/bin\/bash<\/pre>\n<pre>python3 -c \"import torch; print(f'CUDA Available: {torch.cuda.is_available()}'); print(f'Device Name: {torch.cuda.get_device_name(0) if torch.cuda.is_available() else \\\"None\\\"}')\"<\/pre>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-5331\" src=\"https:\/\/dft.wiki\/wp-content\/uploads\/sites\/15\/2026\/02\/Screenshot_2026-02-21_13-04-03.png\" alt=\"\" width=\"183\" height=\"42\" \/><\/p>\n<p>Build with CUDA support to use your GPU.<\/p>\n<pre>CMAKE_ARGS=\"-DSD_CUDA=ON\" pip install stable-diffusion-cpp-python<\/pre>\n<p>Install common dependencies.<\/p>\n<pre>pip install diffusers accelerate transformers sentencepiece\r\npip install --root-user-action=ignore gguf<\/pre>\n<p>Now create and run the example script for your chosen model on HF. 
Here is a basic example.<\/p>\n<pre>from diffusers import DiffusionPipeline\r\nimport torch\r\npipe = DiffusionPipeline.from_pretrained(\r\n    \"<strong>&lt;MODEL_NAME_HERE&gt;<\/strong>\",\r\n    torch_dtype=torch.float16\r\n)\r\npipe.to(\"cuda\")\r\n<strong>prompt = \"Happy family, detailed lighting, 8k\"\r\nnegative_prompt = \"low quality, blurry, deformed\"<\/strong>\r\nimage = pipe(\r\n    prompt=prompt,\r\n    negative_prompt=negative_prompt,\r\n    num_inference_steps=18,\r\n    guidance_scale=8\r\n).images[0]\r\nimage.save(\"generated_image.png\")<\/pre>\n<p>To return to or open a new terminal in the container:<\/p>\n<pre>docker start hf\r\ndocker exec -it hf \/bin\/bash<\/pre>\n<p>Alternatively, look for instructions on running interesting Spaces locally.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-5333\" src=\"https:\/\/dft.wiki\/wp-content\/uploads\/sites\/15\/2026\/02\/Screenshot_2026-02-22_01-09-02.png\" alt=\"\" width=\"371\" height=\"91\" srcset=\"https:\/\/dft.wiki\/wp-content\/uploads\/sites\/15\/2026\/02\/Screenshot_2026-02-22_01-09-02.png 371w, https:\/\/dft.wiki\/wp-content\/uploads\/sites\/15\/2026\/02\/Screenshot_2026-02-22_01-09-02-300x74.png 300w\" sizes=\"auto, (max-width: 371px) 100vw, 371px\" \/><\/p>\n<hr \/>\n<p><strong>LOCALAI<\/strong><\/p>\n<p>LocalAI is a full-featured AI stack that runs in Docker, making it OS-agnostic.<\/p>\n<pre>curl -L https:\/\/install.localai.io | sh<\/pre>\n<p>Or<\/p>\n<pre>docker run --gpus all -p 8080:8080 -v $(pwd)\/models:\/models -v $(pwd)\/backends:\/backends --name local-ai -itd localai\/localai:latest<\/pre>\n<p>Models and backend engines are stored in host directories mounted as volumes, so the container stays ephemeral while your data persists.<\/p>\n<p>Access the dashboard at <strong>http:\/\/localhost:8080<\/strong><\/p>\n<hr \/>\n<p><strong>READ MORE<\/strong><\/p>\n<p><strong>Interacting Directly with 
Ollama&#8217;s API<\/strong> [<a href=\"https:\/\/dft.wiki\/?p=5308\">Link<\/a>]<\/p>\n","protected":false},"excerpt":{"rendered":"<p>For an enthusiast, it&#8217;s a whole universe of exploration, privacy, anti-censorship, and more. The most [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[12],"tags":[],"class_list":["post-5270","post","type-post","status-publish","format-standard","hentry","category-ai"],"_links":{"self":[{"href":"https:\/\/dft.wiki\/index.php?rest_route=\/wp\/v2\/posts\/5270","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/dft.wiki\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/dft.wiki\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/dft.wiki\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/dft.wiki\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=5270"}],"version-history":[{"count":47,"href":"https:\/\/dft.wiki\/index.php?rest_route=\/wp\/v2\/posts\/5270\/revisions"}],"predecessor-version":[{"id":5595,"href":"https:\/\/dft.wiki\/index.php?rest_route=\/wp\/v2\/posts\/5270\/revisions\/5595"}],"wp:attachment":[{"href":"https:\/\/dft.wiki\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=5270"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/dft.wiki\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=5270"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/dft.wiki\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=5270"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}