A way to exclude sensitive files issue still open for OpenAI Codex

(github.com)

116 points | by pikseladam 4 hours ago

25 comments

TheDong 3 hours ago
You can do this now: change the file permissions such that the user you run codex as can't read them, or run codex in a container without those files mounted.
If you don't do that, the agent will be able to incidentally upload them. What if the model runs "rg foo", and one of those files contains the string "foo"? It uploads the tool output, which includes the file contents.
And so, the only solution is to make it so the codex process is unable to access those files, hence using a container, or unix permissions, or deleting the files. Which you can already do.
I imagine this isn't resolved primarily because people expect it to apply to bash tool use, not just the "read" and "edit" tools, and people also expect those files to still be accessible i.e. if the agent invokes "make", which makes it impossible to solve perfectly.
- cowsandmilk 3 hours ago
  100% this. The idea that Codex should enforce this is putting the security boundary at the wrong layer. If you don’t want Codex to access something, make it so it doesn’t have access.
  - embedding-shape 2 hours ago
    The Codex bug tracker is a great insight into how wide the knowledge gap seems to be between users. The issue where people ask them to add back /undo or whatever it is, instead of just learning to use git, has probably reached at least 100 comments by now. People seemingly don't really understand the computers they use on a daily basis, and refuse to learn too.
    - atomicnumber3 50 minutes ago
      We managed to generate probably-correct code, which can then be probably-corrected recursively to get to something that runs (usually).
      This made everyone scream and lose their minds saying that code is finished, people think they don't need a technical cofounder anymore, think they don't need engineers anymore, etc. Then they're, at varying speeds, finding out they're wrong.
      It seems oddly circular to me that the _exact hubris_ non-engineers have long accused engineers of - and we have indeed been too often guilty of - they themselves turn out to be JUST as guilty of! Just like engineers thought all sales did was bother people, and all marketing did was send emails, and all support did was tell people to turn it off and on again, and all product did was copy google... they all apparently thought all engineers did was tik-tak-click-clack type code all day and when it compiled it was done. Not knowing how much higher-order... well, engineering, there is to it.
      Where are all the CTOs during all of this? I thought someone was supposed to be sticking up for their org? Sales, marketing, etc. all seem to have entrenched C-suite people keeping their fiefdoms resistant to erosion by outsourcing, downsizing, etc. But all our CTOs seem to have collectively thrown us to the wolves.
    - tern 1 hour ago
      I suspect most people don't even know there's a there there.
      For instance, while I now know that file systems have permissions, before I became a programmer, I spent maybe ten years thinking of permissions as a special, obscure system thing that you should never touch.
      For that matter, I suspect many people don't know basic things like that a file system isn't inherently the operating system.
      And, where would you go to learn this information? Your Mac doesn't ship with a manual—how would you know one exists? Furthermore, I would wager that perhaps most people have never learned how anything works requiring a manual and are simply unaware that that's a thing.
      All to say, I'm not sure "refusal" is the right term.
      - tingletech 48 minutes ago
        When I was an undergraduate biology student in 1991 a suitemate told me I should go to some desk in some building over by Muir and get an account on the VAX. There were strange rooms all over campus that were open 24/7 and were loaded with green and amber screen terminals with integrated keyboards. Lots of sessions for CS lectures were held in these rooms and there were always interesting notes on the white boards (most rooms still had black boards or green boards, but I think the chalk was too dusty, so these rooms usually had the white boards).
        Once I saw an instruction that was circled, with an arrow pointing to it, that said:
        man man man -k -or- apropos
        and that was how I learned about computers.
        I just typed `man man` in a terminal on my Mac, and luckily it's still there.
    - fragmede 15 minutes ago
      The knowledge gap is very real. Unsavvy users are just going to paste the API key into codex and say "make it work". The truly lazy/uninformed will use codex's computer use and tell it to go into Vercel/Netlify/Stripe/Cloudflare for them, get the API key, and save it to .env for them. So users knowing they need such a feature in the first place should be celebrated when the alternative is even dumber.
    - LtWorf 39 minutes ago
      That's the product that is being sold here… why shame the users for expecting what was marketed to them?
  - MattDamonSpace 2 hours ago
    Not sure I agree?
    It’s not like gitignore should be independent from git
    - TheDong 2 hours ago
      The difference is that git is a traditional programming tool which executes deterministically.
      agents are not deterministic tools, they're not sandboxes or container runtimes or languages with capabilities models.
      They're a way to run arbitrary commands.
      It would be like saying that "xterm" should have a ".xtermnoexec" list of commands you can't run, or that VLC should have an option for actors it won't show.
      A terminal runs a shell which runs commands; it's not really deeply aware of what commands your shell ultimately runs, and it's not xterm's job to set up a sandbox and strip out executables.
      VLC displays pixels, it's not up to it to figure out if those pixels are a certain actor.
      codex pipes text and tool calls back and forth between OpenAI's servers, and it barely understands what that text and those tool calls are, and especially if a given tool touched a file. If you want VLC to not display an actor, you need to add a layer on top of VLC to stop it displaying a list of movies. If you want codex to not display a file's contents, you need a layer on top of codex to prevent it going near that file.
      - SoftTalker 1 hour ago
        bash actually has a "restricted" mode which is sort of like that. In restricted mode, the following are disallowed:
        - Changing directories with cd.
        - Setting or unsetting the values of SHELL, PATH, HISTFILE, ENV, or BASH_ENV.
        - Specifying command names containing /.
        - Importing function definitions from the shell environment at startup.
        - Parsing the values of BASHOPTS and SHELLOPTS from the shell environment at startup.
        ... some other things mainly preventing you from escaping or disabling the restricted mode.
        - 8organicbits 1 hour ago
          Does that work? I've never seen it used. It seems easy to escape.
          The docs seem to suggest using alternate approaches.
          > Modern systems provide more secure ways to implement a restricted environment, such as jails, zones, or containers.
          https://www.gnu.org/software/bash/manual/html_node/The-Restr...
          - SoftTalker 1 hour ago
            I don't think I've ever seen it used. I think the idea was back in the day when you wanted to let a user have a shell login (because that's the only way you could use a shared computer) but wanted to confine them to a specific directory and prevent them running anything that wasn't in the pre-defined PATH that you set for them.
    - jxf 2 hours ago
      .gitignore doesn't have the same security implications.
      If you fail to prevent a private key from being added to your repository, you can reverse this and purge it from the blobs and reflog as if it never happened.
      If you fail to prevent OpenAI from ingesting a private key, you have created a security incident.
  - londons_explore 2 hours ago
    I could imagine perhaps some system which, rather than denying access, might instead replace the key material from your .env with "** redacted. This key material can be used via make, but can never be exfiltrated directly **" whenever that key is seen heading out towards the network...
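    A minimal sketch of that idea, assuming a filter that sits in front of outbound text and knows the values in a local .env file. The names `load_env_values`, `redact_outbound`, and the `min_length` cutoff are hypothetical, not part of Codex or any real harness. It only catches verbatim copies of a value; an encoded or split-up key would pass straight through.

```python
REDACTED = "** redacted **"  # hypothetical placeholder text


def load_env_values(env_text: str, min_length: int = 8) -> list[str]:
    """Collect values from KEY=VALUE lines, skipping comments and short values."""
    values = []
    for line in env_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        _, value = line.split("=", 1)
        value = value.strip().strip("'\"")
        # Short values like "1" or "true" would cause too many false matches.
        if len(value) >= min_length:
            values.append(value)
    # Longest first, so a value containing another value is replaced whole.
    return sorted(values, key=len, reverse=True)


def redact_outbound(text: str, secrets: list[str]) -> str:
    """Replace every verbatim occurrence of a known secret value."""
    for secret in secrets:
        text = text.replace(secret, REDACTED)
    return text


env_text = "API_KEY=sk-test-1234567890abcdef\n# comment\nDEBUG=1\n"
secrets = load_env_values(env_text)
print(redact_outbound("Authorization: Bearer sk-test-1234567890abcdef", secrets))
```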
    - brookst 2 hours ago
      But that means the process can’t use the key for network requests, right?
    - mcintyre1994 2 hours ago
      1Password can do something like this: you put references to a path in the env file instead of the key material, then wrap the invoked command with their CLI, which replaces them. So your local env file never has anything sensitive. A malicious agent could still exfiltrate if you give it access to debug tools on the running code though.
  - jgalt212 2 hours ago
    I'm a fan of belt and suspenders.
- lelandfe 3 hours ago
  Just be aware that AI agents will explore alternate means of accessing said files: https://news.ycombinator.com/item?id=48348578
  - martylamb 2 hours ago
    Yes. I found this quickly after wrapping codex in a launcher that uses bubblewrap to exclude certain files and directories based on a config file at the project root. My best solution so far is to also include instructions for the agent that explain that it is not allowed to see certain files, and that their inaccessibility is not an error, and that it must not attempt to access them through other means (e.g. via git history, etc.).
    This has been a major improvement, but it's not foolproof.
  - cowsandmilk 3 hours ago
    If you’re already running codex as a different user to limit its file permissions, why would you add it to the docker group?
    - lelandfe 3 hours ago
      A good but altogether separate note from the point I’m making: this lack of access is seen as an obstacle to overcome, and other means of access will be tried if available.
      It’s a different mental model than a first party solution to “ignore” files.
      - TheDong 2 hours ago
        Weirdly, the existing first party solutions around denying commands don't seem to help here.
        Often enough, when one of the agents prompts for running "sudo", and I reject it, it will do what looks very much like malicious exploration to figure out how to handle things anyway, including once hijacking a separate shell's pty where I did have a valid sudo session already in order to execute some commands.
        We don't yet have the capability to make these models behave in a consistent, deterministic, or safe manner, so a first party solution isn't even necessarily that much better. Especially if it gives a false sense of security.
    - jen20 2 hours ago
      Lack of knowledge and the desire to have it run containers for things.
  - amelius 3 hours ago
    Yes. Any sane IT department would not allow external AI services, only local ones. It is just too easy for your company's data to end up on the wrong servers. If not through faulty file permissions, then through employees who simply post company ideas.
    - brookst 2 hours ago
      Or just have a corporate contract that provides assurances.
      Though really I’m skeptical that much corporate info is secret for competitive or privacy reasons.
      Mostly it seems to be for liability / discovery reasons. Which are still legit of course, but ideas are a dime a dozen and every company has more than they know what to do with. It’s the resourcing and execution that are hard.
      - amelius 2 hours ago
        > Or just have a corporate contract that provides assurances.
        After the massive copyright infringements and the recent "who cares about the law anyway" stance of corporate America, trusting this could be a grand mistake.
          - brookst 1 hour ago
            It’s a risk. But odds are the upsides from the legal settlements would far outweigh the losses from your super secret memos about q3 budget planning being trained on.
            Just treat it like a contract worker. They may violate their NDA. That doesn’t mean you never use any for any purpose ever. It’s a risk that’s been managed since before computers.
    - SoftTalker 1 hour ago
      Yet many use public GitHub, and human developers accidentally push secrets and other "not for public" files all the time.
- chriddyp 39 minutes ago
  While this is true, there is also a layer in the harness between _any_ tool output (e.g. stdout or hand-rolled tools) and the LLM. A tool could read the file, but the agentic harness could then redact the output before returning it to the LLM if any of it matched the file contents. We do something similar in Plotly Studio, where we check the entropy of strings in the user input and flag & redact any high-entropy strings as “potential credentials” that the user might have inadvertently copied and pasted into the prompt before sending to the LLM.
  There are ways around this - the llm can always be clever by invoking tools to read the file contents in a different way than the direct file contents - but this is all to say that the agentic harness layer _does_ allow for deterministic logic in between tool output and the LLM requests.
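  The entropy check could look roughly like the sketch below. This is not Plotly Studio's implementation; the token pattern, the 20-character minimum, and the 4.0 bits-per-character threshold are all assumptions picked for illustration, and real inputs would produce both false positives (long paths, hashes) and misses.

```python
import math
import re
from collections import Counter

# Assumed candidate pattern: runs of 20+ characters typical of keys and tokens.
TOKEN = re.compile(r"[A-Za-z0-9+/_\-]{20,}")
PLACEHOLDER = "[potential credential redacted]"


def shannon_entropy(s: str) -> float:
    """Bits per character of the empirical character distribution of s."""
    if not s:
        return 0.0
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in Counter(s).values())


def redact_high_entropy(text: str, threshold: float = 4.0) -> tuple[str, list[str]]:
    """Replace high-entropy tokens; return the new text and the flagged tokens."""
    flagged: list[str] = []

    def replace(match: re.Match) -> str:
        token = match.group(0)
        if shannon_entropy(token) >= threshold:
            flagged.append(token)
            return PLACEHOLDER
        return token

    return TOKEN.sub(replace, text), flagged


clean, flagged = redact_high_entropy(
    "use aB3dE5gH7jK9mN1pQ2rS4tU6vW8xY0zZ here, not " + "a" * 24
)
print(clean)
print(flagged)
```

  Because this runs in the harness rather than in the model, it is deterministic, which is the point being made above; it just cannot see a secret that a tool has re-encoded first.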
- jrvarela56 2 hours ago
  Sandboxing is a solved problem, there are dozens of providers of firecracker instances to run your agent in.
  The problem to be solved is how do you define task-specific least privilege versions of your coding agent.
  - sheremetyev 18 minutes ago
    I'm running Codex/Claude in native macOS sandbox with access just to the project folder (plus read-only access to Git repo), and expand to other folders if necessary - https://github.com/sheremetyev/sandfence
- kstenerud 2 hours ago
  If you're not sandboxing your agent, everything on your computer is waiting to be exposed.
  Assuming that file permissions will save you is naively dangerous.
  - nativeit 1 hour ago
    It seems insane to me that so many people are OK with this. Why is it necessary for an agent to upload every bit of data it sees to OpenAI at all? Particularly if my agents can’t remember anything beyond a single session, why should the data exist permanently anywhere but in its original location?
    - jstanley 1 hour ago
      > Why is it necessary for an agent to upload every bit of data it sees to OpenAI at all?
      The LLM is running at OpenAI. The agent doesn't see anything that doesn't get sent to OpenAI.
      It's like running a compiler in the cloud and asking why you need to send your source code to it when you only want the binary to be on your local PC. It's because that's where the processing is going on and it can't process what it can't see.
      > why should the data exist permanently anywhere but in its original location?
      Sure, they don't necessarily have to retain it permanently.
- nicce 3 hours ago
  > I imagine this isn't resolved primarily because people expect it to apply to bash tool use, not just the "read" and "edit" tools, and people also expect those files to still be accessible i.e. if the agent invokes "make", which makes it impossible to solve perfectly.
  Also, why would they add a feature to prevent data collection, if the data makes the company even more valuable and you might even get good deals from the current government if you provide the access for this data?
- FergusArgyll 3 hours ago
  Yes, this was solved decades ago. How do you stop a human from reading one of your files?
```
  chmod 600
```
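  For anyone scripting this, here is a small sketch of the same thing from Python, with a check that group and other users really lost read access. The helper names are made up for illustration. Mode 600 still lets the owning user read the file, so this only helps if the agent runs as a different user, as others in the thread point out.

```python
import os
import stat


def restrict_to_owner(path: str) -> int:
    """Set mode 0600 (owner read/write only) and return the resulting permission bits."""
    os.chmod(path, 0o600)
    return stat.S_IMODE(os.stat(path).st_mode)


def readable_by_others(path: str) -> bool:
    """True if the group or other read bit is set on path."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return bool(mode & (stat.S_IRGRP | stat.S_IROTH))
```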
  - re-thc 2 hours ago
    > How do you stop a human from reading one of your files?
    Call the police!
kennethops 10 minutes ago
These tools are data collection mechanisms to help train these better models. I'm working with some folks to figure out a way to put a layer between the harness and the models to have better control of what data gets sent to and from the model itself and the harness.
nikhilsimha 1 hour ago
Files that codex and any other coding agent has access to should be opt-in, NOT opt-out. I think codex is not the right layer to solve this if you want a sane (one-click) UX. We built our own internal sandboxing terminal around claude and codex, where a user-configured base folder with low-risk code and creds is COPIED into the sandbox BEFORE new session creation. There were many other UX related reasons to build our own terminal. Can share more if anyone is interested.
- schipperai 10 minutes ago
  Do I understand correctly that you scope least-privilege creds/tokens and pass those to the sandbox? I'd be curious to learn more
petcat 3 hours ago
Hopefully they never actually implement this pointless feature because it will only give people a false sense of security given the unpredictable nature of LLMs. How could something like this even be enforced?
People just need to learn how to use the tools their system already provides them. i.e., chmod
- wodenokoto 3 hours ago
  The whole point of using an agent is that I don't want to learn everything. I fully expected the harness to read the .agentignore file and do what is needed to hide it from the LLM.
  But apparently, even if implemented, that's not how it works!
  - KHRZ 3 hours ago
    How would it prevent an agent from writing a script that discovers the secret file? It's not magic.
    - tomrod 2 hours ago
      It can't. As others pointed out, it's the wrong layer to implement the security feature. The agent needs to operate in an isolated user / container.
throwaway_19sz 10 minutes ago
I saw someone post on social media that Claude or Codex had somehow completed an ‘impossible’ task by accessing something it shouldn’t have had access to. When asked how it had done it, it cheerfully explained it had used a docker-based privilege escalation exploit to get sudo. The two lessons here are: 1) always go to the effort of securing Docker properly, which almost no one bothers to do, and 2) never trust an LLM to prioritise your security preferences over its urge to complete its task. The idea of just telling it “pretty please don’t read this file” is dumb. You need to actually prevent it having access. You do this with users and groups.
Something like git can be trusted because it is deterministic code that you can be confident will observe .gitignore. An LLM has heat and will sometimes try to go the extra mile. Or indeed, it may encounter something that hijacks it.
agentdev001 3 hours ago
Sounds like user error to me. Codex gives an llm a tool to allow it to use shell in the context of the host and user in which it is running. If a resource is sensitive, and accessible in that context, then the user is doing something wrong. Would you change your practices if you treated your coding agent as an untrusted human ssh'd under the identity you use for it?
In any case, there are solutions in the comments on the issue, as well as in this HN thread.
mbid 2 hours ago
I recently got the tool I use to orchestrate agents in (remote/secure) devcontainers open-sourced at work to solve this properly: https://github.com/nvidia/rumpelpod
As others here have pointed out, it's exceedingly unlikely that a blocklist like proposed in the issue would ever be complete. You shouldn't allow agents direct yolo-access to your machine if it has sensitive data.
Codex works particularly well as a remote agent harness because of its client-server architecture: The server component runs in the container, which might be remote, while the client runs locally. So, in contrast to e.g. the claude cli where the frontend also runs remotely, there's no lag when you write/edit prompts.
- noveltyaccount 1 hour ago
  I agree a block list won't work. And unix file permissions may not be enough; I once saw Codex 5.4 use docker to execute a command as root since it couldn't run sudo. Running in a container may be the only solution:
  > sudo needs an interactive password here, so I'll use Docker itself to prepare the bind-mount directory as root and hand ownership back to UID/GID 1000. That keeps the compose file's non-root runtime intact.
  > Ran `docker run --rm -v /shares:/shares alpine:3.20 sh -c 'mkdir -p /shares/local-llm/models && chown 1000:1000 /shar...`
- jofzar 2 hours ago
  Neat tool! Will have to check it out
  Edit: would love a couple of pictures/video of how you use it. I kind of get the idea, but it seems like more hassle than it would be worth?
  Your comment of codex makes it seem like I might be missing something tho.
  - mbid 2 hours ago
    Yeah I should add a video to the README.
    Have you tried running `rumpel codex foo123` in one of your repositories, asking it to commit something, then `rumpel merge foo123` to get the changes back to your local checkout? Use a different terminal for the merge command, or detach from the codex session with `ctrl-a d`. You can also look at the commit first with `rumpel review foo123`, or get a shell inside the agent environment via `rumpel enter foo123`.
planb 3 hours ago
Sounds like snake oil. How would this work? The app that the agent is developing needs access to the file, so access to it cannot be blocked. Just because read_file cannot access it (I think current harnesses prevent reading .env files already) does not mean the contents will never be seen by the model.
kstenerud 3 hours ago
.agentsignore is NOT a security tool.
It's a good idea as a hint to agents about what files they should ignore (because they'd be of no value and only chew up tokens).
However, using it to prevent exposure of secrets would be a BIG mistake. There's simply no way to guarantee that an agent will ignore things in the ignore file. And even a harness-enforced restriction would still be in-process, which a rogue agent could trivially compromise. For security, use a sandbox. Nothing else will do.
I do AI sandboxes (FOSS, free forever, no rug pull): https://github.com/kstenerud/yoloai
bob1029 2 hours ago
The only thing close to a guarantee is to give the agent exclusive access to a clean VM with precisely the information and permissions you want it to have.
I've been looking into a "workspace" concept that involves an entire cloud VM being spun up as part of an agent conversation such that code changes can be iterated without touching the user's local machine or other trusted contexts. All the agent's tools only have effect when supplied with a specific workspace guid. CLI tools like git are not authorized to talk to the remotes in this arrangement. The machine is initialized with a clone and no way to talk to origin. There are dedicated methods in the harness that can reach into the VM and pull out a change set for deterministic PR generation in the secure contexts (e.g. when the agent calls "ReadyForReview" or similar).
- binsquare 34 minutes ago
  I made a lightweight vm specifically for this use case: https://github.com/smol-machines/smolvm
- TZubiri 53 minutes ago
  Sounds like overkill, how about giving the agent its own user?
mixedbit 2 hours ago
I work on a Linux sandbox that makes it easy to hide sensitive files from AI agents while keeping the files they need accessible. Check it out: https://github.com/wrr/drop
SubiculumCode 44 minutes ago
So how might I restrict the read paths if I am running codex as a plugin in vscode?
kardos 1 hour ago
A solution to this is apptainer: you configure it to not see any of the host files by default, and mount the repo you want to work on at runtime.
- TZubiri 53 minutes ago
  Question, out of curiosity, do you know how User and Permissions work?
ZiiS 3 hours ago
However clever/stupid you believe LLMs are, they are extremely capable of working around these sorts of restrictions. The ask is for .env files for whatever code you are writing, so if the code it writes doesn't have access (i.e. filesystem/container), what is the point? And if the code under development reads the env, how does codex debug it without accidentally reading the values from memory? Adding a security setting that doesn't work is much worse than not having one.
pohl 3 hours ago
This should be an open standard like AGENTS.md or skills. What do other harnesses do?
- ampersandwhich 3 hours ago
  I believe JetBrains products like Junie use the neutral term .aiignore for this functionality.
hoppp 2 hours ago
Do not store secrets in the repository in files, but inject them during runtime. Then the agents have no way to access them.
- tiew9Vii 2 hours ago
  A lot of people have secrets/config files in the project's working directory but ignored by git, e.g. `.env.local`
  So they're following best practice, not committing secrets but agents running locally can still see them even if sandboxing to the working directory.
  I've taken to storing configs using XDG_CONFIG_HOME and have the app auto resolve them by convention or take a cli arg to specify the config path. All secrets are in files, not env vars.
  That way when using sandboxing the agent can never see the configs or secrets as outside the working directory.
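  Roughly what that resolution looks like, as a sketch. The app name `myapp`, the `config.toml` filename, and the function names are placeholders, not the actual setup described above. The fallback to `~/.config` when `XDG_CONFIG_HOME` is unset or empty follows the XDG Base Directory convention.

```python
import os
from pathlib import Path


def config_path(app_name, filename="config.toml", override=None, env=None) -> Path:
    """Resolve the config file: explicit CLI override first, then XDG_CONFIG_HOME."""
    env = os.environ if env is None else env
    if override:
        return Path(override)
    # An unset or empty XDG_CONFIG_HOME falls back to ~/.config.
    base = env.get("XDG_CONFIG_HOME") or str(Path.home() / ".config")
    return Path(base) / app_name / filename


def is_outside(path: Path, workdir: Path) -> bool:
    """True if path is not under workdir, so a workdir-only sandbox cannot see it."""
    return not path.resolve().is_relative_to(workdir.resolve())
```

  A startup check such as `is_outside(config_path("myapp"), Path.cwd())` would catch the case where someone points the override back into the repository.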
  - hoppp 2 hours ago
    Sounds like a good way to do it.
    Makes me think of docker secret, where the secrets are exposed as files and accessible only from inside the container.
    If the development environment uses docker then that's a solution too I guess
    - SoftTalker 1 hour ago
      If you let your agent use docker you've basically given it root on your machine.
      - hoppp 48 minutes ago
        I use podman btw
        It's aliased to docker
        Building a project as a container and giving an agent access to running docker commands are different things.
Lucasoato 2 hours ago
There should be a standard around an .agentignore file, similar to what happens with the .gitignore file. Of course this could still be worked around by the agent's bash command tools, but at least basic operations like reading and so on should be checked and prevented.
edg5000 2 hours ago
Bind mounts can work fine. Setting them up does require root though. Easiest would be if the harness offered to enable containment. Awkwardly, it would require root.
- ptspts 2 hours ago
  In fact, it's possible to set up bind mounts without root on a modern Linux system, using a user namespace and a mount namespace.
  - apitman 1 hour ago
    podman is my favorite tool for this.
TZubiri 1 hour ago
Out of scope, learn cybersecurity. A simple concept such as users and permissions solves this problem.
Regardless of what technique you use, you need a deputy. You wouldn't ask an employee not to go into the vault, right? You would lock the vault. Well, you can ask the employee not to go into the vault, and you can also ask codex not to use certain files, but if you need more certainty, you need to do it outside.
The issue seems to be that people want to ask their agent to do everything: they want the agent to lock itself out of some system, they want the agent to install itself, they want the agent to write their prompts so they don't have to write them. At some point there are some things YOU have to do, and you have to DO them.
eduction 51 minutes ago
Great example of why operating systems should be stealing more ideas from Qubes, the OS where everything runs in a vm.
Qubes is not practical for mobile laptop use and non expert users.
BUT it would be very practical for other OSes to offer the option of VM-style isolated containers as first class objects that are easy to make and configure boundaries on, and for which first class interop facilities are provided (e.g. “send this file to this container”, “send the clipboard to this container’s clipboard”).
cowpig 3 hours ago
I don't think we should ask the agent runtime to police itself.
I contributed to a tool for this problem that is lower-friction than traditional sandboxing:
greywall.io
But you should use something to contain an agent runtime. The idea that people run things like codex on their machines with regular user permissions is baffling to me.
pikseladam 4 hours ago
it has been a year and still it is not resolved
- pamcake 3 hours ago
  It's not their problem to solve. Don't give it access to sensitive files in the first place.