DeepSeek reasonix, DeepSeek native coding agent with high caching and low cost

(esengine.github.io)

70 points | by Alifatisk 2 hours ago

14 comments

embedding-shape 1 hour ago
I'm not sure you need a "DeepSeek native coding agent" to take advantage of DeepSeek's cache. Yesterday, as the Codex quota usage issue still wasn't solved for me, I wrote a tiny little bridge so I could use DeepSeek V4 Pro via Codex, and it seems almost everything I did was cached as far as I can tell: https://i.imgur.com/7eKn6wN.png (2026-05-23 Input (Cache hit): 39,123,200 tokens, Input (Cache miss): 1,692,286 tokens). The bridge is doing nothing special, just massaging the DeepSeek API shape into what Codex expects, nothing particular about caching at all.
Besides being even better at the caching, I'm not sure what benefits you'd get compared to just firing up OpenCode with the DeepSeek API yourself, it'll similarly do caching for sure and also "talks directly to api.deepseek.com" if that matters, and you'll get a much more mature harness.
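For scale, the two token counts quoted above imply a hit rate of roughly 96%. A minimal sketch of that arithmetic; the function name is hypothetical and only the two counts come from the comment:

```python
def cache_hit_rate(hit_tokens: int, miss_tokens: int) -> float:
    """Fraction of input tokens that were served from the prompt cache."""
    total = hit_tokens + miss_tokens
    if total == 0:
        return 0.0
    return hit_tokens / total


# Token counts quoted in the comment above (2026-05-23).
rate = cache_hit_rate(39_123_200, 1_692_286)
print(f"cache hit rate: {rate:.1%}")
```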
- 3uler 34 minutes ago
  Opencode has really bad cache stability issues that they seem uninterested in fixing at the moment.
  - embedding-shape 13 minutes ago
    That'd be really easy to spot and also fix, most likely. Any open issue you could point us to? It must surely have been reported already.
- bwfan123 1 hour ago
  > I wrote a tiny little bridge so I could use DeepSeek V4 Pro via Codex
  Can you share the bridge? DeepSeek v4 is awesome paired with claude-code or opencode. I found that claude code costs me less than opencode, and I presume this is due to a better-engineered harness.
  - embedding-shape 1 hour ago
    Sure, keep in mind it's a steaming pile of hacked together hacks, probably won't work in every case, doesn't support every feature that should be supported (like parallel tool calling, both Codex + DeepSeek API support it), and it might make your computer catch on fire: https://gist.github.com/embedding-shapes/eab3e63e5a95d3d78a2...
    I only used it for a few hours to play around with stuff before the quota issue was fixed and I could resume using GPT models, and the bridge was coded by DeepSeek-V4-Flash-IQ2XXS + DwarfStar4 locally, I take no responsibility for what might happen with your computer or you, during usage or just reading the code.
    Edit: heh, like don't look at line 117 for example where seemingly it likes to handle misspellings in the .env file which totally wasn't my fault for typo'ing the API key in that file... I'm sure there are tons of sharp edges and dumb stuff in there.
  - Den_VR 26 minutes ago
    I’m feeling more of a novice every day, but how isn’t this just handing over your code to team DeepSeek for whatever they might want?
    - embedding-shape 14 minutes ago
      Not everyone is working with state secrets or user personal data (or even more closely guarded, company secrets) on a daily basis, most of what I hack on is either FOSS already, or will be, not much to keep secret here.
      Obviously, if you do deal with any sort of secrets, then using local LLMs over OpenAI, Anthropic, DeepSeek or whoever is preferred, and in the case of users' personal data, probably a requirement.
- himata4113 1 hour ago
  this appears to be native to the terminal, as in, there's no special application that runs or wraps an agent inside a tui. So basically instead of commands you type plain English?
  - embedding-shape 1 hour ago
    > this appears to be native to the terminal, as in, there's no special application that runs or wraps an agent inside a tui
    Same with codex? codex-rs, at least, is a TUI as well. It does run an "app-server" in the background that the TUI actually interacts with, but that's just an implementation detail. It also makes it easy to hook in your own programs to fire off codex "headless" sessions even without the TUI.
skeledrew 1 hour ago
Not a fan of that page. The animated typing and resulting continuous resize of the example keeps moving the content beneath it down and up. Such bad UX.
- embedding-shape 54 minutes ago
  Agents or no agents, people still need to test their websites at different resolutions, or at least window widths, but it seems this is becoming a lost art.
  - mirekrusin 15 minutes ago
    Yeah, it doesn’t look designed for people who want to read beyond the typing animation.
schaefer 34 minutes ago
Okay, I'm curious.
From the FAQ, I see:
>Can I point it at a self-hosted / private DeepSeek endpoint?
>Yes. Since 0.30 we accept non-standard key prefixes for self-hosted DeepSeek endpoints. Just point `baseUrl` at your internal address — the loop, cache strategy, and tool protocol are unchanged.
But my question is: if I use Reasonix to talk to a DeepSeek endpoint through OpenRouter, am I still getting the cache-hit benefits of this agent harness?
- csunoser 18 minutes ago
  Yes*. At least from my limited usage of deepseek-flash for a few billion tokens on openrouter, the cache-hit rate is >95%. And I simply used the claude code harness pointed at the openrouter anthropic compatible endpoint with no fluff.
  - schaefer 8 minutes ago
    thank you!
unshavedyak 13 minutes ago
It's pretty funny, i'm a $200/m Claude subscriber and i've had little need to use anything else. However, the more Claude has been restricting my workflow (notably around the recent IDE/-p usage change), the more i've been wanting to go elsewhere.
I'm concerned since i really want SOTA reasoning, but DeepSeek still has me interested.
declan_roberts 1 hour ago
I love the focus on cache hit efficiency. Hats off to the DeepSeek team for creating a great product that maximizes cost efficiency for the user.
- bwfan123 51 minutes ago
  > Hats off to the DeepSeek team for creating a great product
  I have been using it for a while, and I wholeheartedly agree. imo, it is as good as codex or claude, which I also use. It is a winner in the cost-sensitive tier, and if some startup could put it together with data retention in mind, it could be a great product sold to the enterprise, as data retention and privacy are the main issues for the coding-assistant use case.
  - chillfox 19 minutes ago
    Deepseek v4 pro is definitely my preferred cheap model, it's very good, and I use it all the time for my personal projects (opencode go plan), but I also use Claude Opus all the time at work and Deepseek is not as good as that, but it does compete with Sonnet for capability, and beats it on price.
- stavros 45 minutes ago
  How can you have cache hit efficiency? Isn't it just a matter of not changing the previous context? I don't understand what knobs there are to tweak on this.
  - everforward 28 minutes ago
    > Isn't it just a matter of not changing the previous context?
    Yes, but a lot of harnesses change previous context. E.g. the system prompt injects the current time/date, working directory, files in the working directory, etc. Compaction also changes the whole previous context. I _think_ changing the list of tools also invalidates cache, so invoking a subagent with different tools would invalidate the cache.
    My vague impression is that it's in a similar vein to functional programming languages. It generally disallows doing things that lead to bugs (cache misses in this case), and presumably allows you to do those things in a way that makes it much clearer that this is likely to cause cache misses. I would guess that in this paradigm, you don't mutate your existing session, you derive a new session by mutating the prior context into a new context.
    - chillfox 16 minutes ago
      changing between plan/build mode in some agents will change the tools list, which breaks the cache.
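The "derive a new session instead of mutating the old one" idea from this sub-thread can be sketched as follows. This is an illustration of the commenter's guess, not Reasonix's actual API: the `Session` class, its methods, and the line-based serialization are all hypothetical, and it assumes the provider caches by exact prompt prefix.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class Session:
    """Hypothetical immutable session whose serialized form is a prompt prefix."""
    system: str
    tools: Tuple[str, ...]
    messages: Tuple[str, ...] = ()

    def append(self, message: str) -> "Session":
        # Extending the tail leaves every earlier character untouched,
        # so the old prompt remains a prefix of the new one.
        return replace(self, messages=self.messages + (message,))

    def with_tools(self, tools: Tuple[str, ...]) -> "Session":
        # Tools are serialized before the messages, so changing them is an
        # explicit fork that is expected to miss the cache from that point on.
        return replace(self, tools=tools)

    def serialize(self) -> str:
        # Assumed layout: system prompt, then tool list, then messages.
        return "\n".join([self.system, "tools: " + ",".join(self.tools), *self.messages])


base = Session("You are a coding agent.", ("read", "write")).append("user: fix the bug")
extended = base.append("assistant: done")
forked = base.with_tools(("read",))  # e.g. switching from build mode to plan mode

print(extended.serialize().startswith(base.serialize()))  # append keeps the prefix
print(forked.serialize().startswith(base.serialize()))    # tool change breaks it
```

Because the dataclass is frozen, the only way to change earlier context is through a method whose name makes the cache cost visible.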
mmaunder 26 minutes ago
Unusable thanks to the top animation pushing the rest of the site down repeatedly as you’re trying to read.
singiamtel 9 minutes ago
I would've liked benchmarks against other harnesses showing the caching performance
hirako2000 1 hour ago
Good timing given the cost spike across other frontier models.
- notjes 55 minutes ago
  Good thing DS just made their discount permanent. https://x.com/deepseek_ai/status/2057854261699195173
theanonymousone 1 hour ago
Isn't caching a server-side thing? How does the agent affect it, significantly at least?
- embedding-shape 56 minutes ago
  Say you put the current time, down to the second, in the system prompt, which is the message that goes in front of the entire conversation. Then basically nothing will be cached, and every agent turn needs to ingest the entire session over and over. Contrast that with not doing so: the backend can reuse the cache all the way up to the latest message, since nothing before it changed.
  - esperent 35 minutes ago
    Surely other agent CLIs are not dumb enough to invalidate cache on every turn over something so obvious?
    - chillfox 13 minutes ago
      I don't think any of the agents break caching on every turn, but they might include things like the current list of files, or the available tools depending upon plan/build mode... or lots of other things that break caching multiple times during a session.
    - embedding-shape 16 minutes ago
      Obviously not, most agents properly keep previous messages unchanged, at least the major ones I've been digging into the source of. Also, everything would get so much slower that even developers creating their own agents would quickly notice how much slower theirs is if they fuck this up.
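The timestamp example from this sub-thread can be sketched in a few lines. This is a toy model under stated assumptions: the prompt layout and the helper names `build_prompt` and `shared_prefix_len` are hypothetical, and real providers match on tokens rather than characters.

```python
from datetime import datetime, timedelta
from typing import List, Optional

SYSTEM = "You are a coding agent."


def build_prompt(history: List[str], now: Optional[datetime] = None) -> str:
    """Serialize one request; `now`, if given, is embedded in the system prompt."""
    header = SYSTEM if now is None else f"{SYSTEM} Current time: {now.isoformat()}"
    return "\n".join([header, *history])


def shared_prefix_len(a: str, b: str) -> int:
    """Number of leading characters two prompts have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


history = [f"message {i}: " + "x" * 200 for i in range(50)]
next_history = history + ["message 50: new"]
t0 = datetime(2026, 5, 23, 12, 0, 0)

# Turn 2 appends one message and is sent 7 seconds after turn 1.
stable_1 = build_prompt(history)
stable_2 = build_prompt(next_history)
stamped_1 = build_prompt(history, now=t0)
stamped_2 = build_prompt(next_history, now=t0 + timedelta(seconds=7))

print("stable prefix reused:", shared_prefix_len(stable_1, stable_2), "of", len(stable_1))
print("stamped prefix reused:", shared_prefix_len(stamped_1, stamped_2), "of", len(stamped_1))
```

Without the timestamp, the whole first request is a prefix of the second. With it, the two requests diverge inside the system prompt, so none of the conversation that follows can be reused.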
sergiotapia 1 hour ago
What AI model did you use for the website design? This is the second one I see with the exact same font and color scheme. Just curious because Claude models lean towards purples for example. Thank you!
- pcwelder 22 minutes ago
  Opus 4.7 selects such palette and motifs by default. Might even be first iteration of claude design.
- FergusArgyll 4 minutes ago
  Frontend design skill by Anthropic specifically says not to use purple. I'd be surprised if it still uses purple. Have you seen that recently?
- franga2000 52 minutes ago
  This design still screams Claude to me, but a newer version than what you're thinking of. At some point they added a markdown file that tells it to use obviously AI designs like lots of blue/purple and gradients. Since then, this is its new style.
- sheepscreek 41 minutes ago
  DeepSeek v4 perhaps?
canadiantim 1 hour ago
So what's the best low cost coding agent these days? Kimi 2.6? Qwen's latest closed model? Composer 2.5? DeepSeek?
- passive 1 hour ago
  I've gone through ~600m tokens in Xiaomi Mimo through Claude, and it's been the most effective use of an agent I've had yet. It's very capable, but generally not ambitious, picking simple but effective solutions to most problems I give it. Going to write something longer about the experience when I get to a billion tokens.
  - Alifatisk 22 minutes ago
    I do have my eyes on the coding plan, which is quite generous.
    https://mimo.mi.com
  - gandreani 1 hour ago
    Are you using Mimo 2.5 pro?
- skeledrew 1 hour ago
  Seems to be DeepSeek.
  https://news.ycombinator.com/item?id=48237663
- bwfan123 1 hour ago
  In my experience, it is claude-code paired with deepseek-v4. For penny-pinchers like me, I can have long coding sessions with it with no anxiety about the cost. Also, prompting it for what you want and verifying the outputs is more important than the quality of the model. So, I am better off with a cheaper model and taking responsibility for prompting it and verifying the results.
  - esperent 34 minutes ago
    It's obviously much cheaper paying by the token but how does it compare to a codex subscription on cost?
  - epolanski 50 minutes ago
    Can you quantify the actual costs in a week and the use you make?
    - wongarsu 22 minutes ago
      Not GP, but for my use I'd estimate $0.10-0.30 per hour of use per agent with DeepSeek v4 Pro
- ac29 1 hour ago
  Kimi 2.6 is great. Qwen3.7-max benchmarks similarly, but I haven't used it yet.
- stavros 56 minutes ago
  For me, it's by far Deepseek. It's many times cheaper than competitors, and about as good as Sonnet 4.6.
- lostmsu 1 hour ago
  Just use codex with 5.5 on low reasoning levels
the_mitsuhiko 49 minutes ago
[dead]