Parsing the .DS_Store File Format (2018) (0day.work)
150 points by popcalc on March 28, 2023 | 86 comments


> Apple's operating system creates this file in apparently all directories to store meta information about its contents. In fact, it contains the names of all files (and also directories) in that folder.

Anyone have an idea why modern versions of MacOS still uses a .DS_Store file? The OS could just as easily read and display the contents of a directory. And deleting a .DS_Store file doesn't seem to have any noticeable negative effect.


.DS_Store is not part of HFS+ or APFS. It's a kludge to give the OSX Finder a place to store its own metadata.

MacOS prior to X had much more elegant solutions to the problem. To this day, when you switch to/from icon/list view in the Finder, more windows than just the current one switch. In MacOS 9 that loosey-goosey unpredictable behavior never happened.


> To this day, when you switch to/from icon/list view in the Finder, more windows than just the current one switch

And there's almost no way to make this behavior stick because... it's different if you click through folders and if you click on pinned folders in Finder sidebar.

I mean... How?!


Don't even get me started on the "Upload" dialog that can't remember that I want it to show dates and sort by them. It resets every time to the worst possible view - list, sorted alphabetically. So annoying. All other desktop OSes I've used got the upload windows remembering their settings, but not MacOS. Maybe the technology is just not there yet.


The file is used for storing layouts (show as icons/list/grid etc.) and various settings (sorting, grouping etc.).

Storing a local file inside the directory allows for moving that directory around while also keeping those settings.

The file is invisible for users of Finder so it is still the best solution for this task.

I have a .DS_Store line in a global .gitignore and don't notice it much, but when I'm annoyed by .DS_Store files in a specific directory, I usually bulk delete them using `fd` (the `echo` below makes it a dry run; remove it to actually delete):

    fd -u --type file --fixed-strings .DS_Store $HOME/Projects/some-file-server-project -x echo rm {}
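
For anyone without `fd` installed, a rough equivalent with plain `find` might look like the sketch below. The `target` directory is a throwaway temp dir invented for the demo, not a path from the comment above; point it at a real project only after checking the dry-run output.

```shell
# Hypothetical demo tree; replace "$target" with your own project directory.
target="$(mktemp -d)"
mkdir -p "$target/a/b"
touch "$target/.DS_Store" "$target/a/.DS_Store" "$target/a/b/keep.txt"

# Dry run: print what would be removed.
find "$target" -type f -name .DS_Store -print

# Actual removal (-delete is supported by GNU and BSD/macOS find).
find "$target" -type f -name .DS_Store -delete

# Count what is left after the removal.
remaining="$(find "$target" -type f -name .DS_Store | wc -l | tr -d ' ')"
echo "remaining: $remaining"
```

Running the `-print` form first serves the same purpose as the `echo` in the `fd` command.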


I'm surprised that Apple doesn't do directory-level metadata directly in the filesystem itself, e.g. with xattrs on the directory inode.


It does store that metadata in the directing in Mac OS native file systems. The DS files are only for non-Mac file systems like FAT32.


This is not true. My APFS volumes are absolutely littered with .DS_Stores. They're generated every time you open a directory in the Finder.


“Directory” not “Directing”


> Anyone have an idea why modern versions of MacOS still uses a .DS_Store file?

The Wikipedia article is a pretty good resource for this: https://en.wikipedia.org/wiki/.DS_Store


>Anyone have an idea why modern versions of MacOS still uses a .DS_Store file?

For the exact same reason Windows has a Desktop.ini file?

They store per directory settings for the Mac Finder and Windows Explorer.


I believe if you've dragged some files around in that folder, it stores where those were. Or if you've set the folder to sort by name or date modified, etc.


Yeah, it basically maintains Finder window state.

One thing I've wondered is if this method still makes sense on modern hardware compared to keeping all that information in a centralized sqlite file or something along those lines. I could see individual files being advantageous on a slow mechanical HD where seeking through a file that could eventually grow large might pose issues, but that shouldn't be a problem on an SSD.

The main advantage I can think of that might still be relevant today is that it makes Finder window state machine-independent, so when you open a folder on two different machines it'll still open in list mode sorted by Date Modified or whatever.


It also makes the info follow around if the folder is moved, something that an externally maintained database file will have difficulties doing.


Windows has always maintained the view information in the registry. (https://www.nirsoft.net/utils/shell_bags_view.html)


I'm pretty sure it creates (used to? At least with XP) an ini file to remember the position of files in a folder if you manually arrange them, as well as to save view preferences (list, grid, tree) if you enable it per-directory.

Not sure if it's "Desktop.ini" I'm thinking about, but here's some documentation on it: https://learn.microsoft.com/en-us/windows/win32/shell/how-to...


I think Windows 7 (or thereabouts) ditched the Thumbs.db thumbnail cache file for each directory for a centralised file. Not sure what the reasoning was.


I've noticed that the .DS_Store file doesn't appear in a directory until I've clicked the triangle in a list view to see the contents of a subfolder. Then it creates one, I assume, to keep track of which subfolders are open.


Correct. If you create a new directory in the Terminal and populate it, it won't have a .DS_Store file. The Finder creates .DS_Store when you view the directory in the Finder.


Right. I would assume custom icons and labels (do those still exist?).

Basically it seems to be the replacement for the old resource fork data on non-Apple file systems.


I think you're right that .DS_Store handles labels, but custom icons are done with a second invisible file in the folder titled "Icon?" which apparently stores the icon data in its resource fork.


Labels and custom icons on files are stored in extended attributes (implemented as named forks on HFS+; not sure about APFS but the OS presents them all as xattrs regardless of FS) on the files themselves in APFS and HFS+.

Custom icons for folders are stored in a special "Icon?" hidden file, but not actually in the file's normal data-- they're in an extended attribute too.

(The xattr in question is called "com.apple.ResourceFork" and I'm presuming is structurally similar to a resource fork in classic Mac OS.)


If you know what a dotfile is, it’s there for the same reason.

(Dotfiles are per-user system configuration options written as a file in the user’s home directory. This means that for any system on which that home directory exists, that user’s configuration options will be applied. If the home directory is shared across multiple computers over a network, or if it is backed up from one computer and restored to another, then that user’s settings will persist.)

If these settings were stored in some other central database, then if you copied the macOS folder from one computer to another, your copy tool would have to know how to copy the settings as well. Keeping those settings inside the folder itself means that your copy tool only needs to know about copying folders, instead of copying folders and also copying settings.

It’s one of those Unix principles where everything is treated as a file: an elegant technique for a more civilized age.


Curious what creative use cases people have around this bizarre file format. The only reason myself and millions of other people have to deal with this stupid file is adding it to .gitignore as part of the ritual of spinning up a new codebase


nitpick: As it’s a platform and machine specific thing, technically it shouldn’t be in .gitignore but in .git/info/exclude

People working on Linux or Windows do not need to know about garbage your dev env leaves on your machine.

I’m saying this half-seriously as I fully understand that these are so common, it’s more convenient to have the exclusion synced between clones.

That said, as a purist, none of the repos I’m watching over have .DS_Store in their gitignore


I think this depends on how you interpret it. All the docs[0] say is:

> Patterns which should be version-controlled and distributed to other repositories via clone (i.e., files that all developers will want to ignore) should go into a .gitignore file.

and

> Patterns which are specific to a particular repository but which do not need to be shared with other related repositories (e.g., auxiliary files that live inside the repository but are specific to one user’s workflow) should go into the $GIT_DIR/info/exclude file.

To me it seems obvious that all developers would want to ignore .DS_Store even if only some of them would generate it. And certainly you want it distributed to other mac users when they clone. Further, it's certainly not the case that .DS_Store is specific to just one user's workflow. In all three cases, therefore, .gitignore would be reasonably appropriate.

[0]: https://git-scm.com/docs/gitignore
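
To make the distinction in those two quoted passages concrete, here is a minimal sketch using a throwaway repository. The repo path and the `*.log` / `debug.log` names are made up for the demo; `git check-ignore -v` reports which file supplied the matching rule.

```shell
# Throwaway repository for the demo (hypothetical path).
repo="$(mktemp -d)"
git init -q "$repo"

# Shared rule: travels with the repository once .gitignore is committed.
echo '*.log' > "$repo/.gitignore"

# Local rule: stays in this clone only and is never committed.
mkdir -p "$repo/.git/info"   # usually already present after git init
echo '.DS_Store' >> "$repo/.git/info/exclude"

touch "$repo/.DS_Store" "$repo/debug.log"

# -v prints the source file, line number and pattern for each match.
git -C "$repo" check-ignore -v .DS_Store debug.log
shared_src="$(git -C "$repo" check-ignore -v debug.log | cut -d: -f1)"
local_src="$(git -C "$repo" check-ignore -v .DS_Store | cut -d: -f1)"
```

Either location keeps the file out of `git status`; the difference is only whether other clones inherit the rule.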


Interesting, I guess I live in impure lands :)

I understand the pollution concern… but unless that .git/info/exclude is standard in the company, what prevents a intern to not have a correct exclusion there, and happily push a .ds file ?

.gitignore looks more robust to the obvious user. And it’s not like it’s a file you need to look at attentively every day.

But yes, I do see that you are correct!


> what prevents a intern to not have a correct exclusion there, and happily push a .ds file ?

Our CI.

And code review is what prevents .DS_Store from being added to .gitignore.

And, again, I'm saying this as a purist with a developer team of 99% Mac users.


If anyone on my team even dared to waste time on this discussion, let alone add CI checks or instruct the team about moving the exclude from A to B, we'd have a serious conversation about generating business value, cargo culting, and the purpose of code in general. Fascinating.


Some of the repos we work on date back to 2004 (going from CVS to Subversion, to git. Developers moving from mostly Windows to mostly Linux, to mostly macOS). If everybody was free to check in debris of whatever IDE and/or OS they were working on at any given time, the codebase would be a terrible mess, especially as such debris tends to go unnoticed for ages until it doesn't.

Just like we have CI checks to make sure nobody accidentally commits a secret (like the GitHub host key thing last week), we have checks that prevent debris from being committed and that make sure code is formatted according to agreed-upon coding standards.

All of this might seem superfluous when the life expectancy of a repo is measured in months or single digit years, but then, no solution or repo is more permanent than a quick throw-away one.

Which is why this isn't even a discussion but just a reality. Been there, done that, learned my learnings.


You're moving the goal posts here. We're not discussing not checking in debris like `.DS_Store` files; I'm totally on board with that, but that's covered in any .gitignore template generated by my IDE.

Instead, you appear to be enforcing in which arbitrary location to place ignore rules for debris, seemingly having spent considerable time implementing that, for entirely puritanical reasons.

And while you're free to play holier-than-thou at your job as you like, I'd give hell to anyone wasting my team's time like that.


There’s also an argument for having every exclusion centralised in one config file (ie gitignore) with regards to making it easier to review what exclusions are active.

As a DevOps guy (but with 30 years of experience in development too), one of my pet peeves is having to deal with a thousand different edge cases because someone decided that intellectual cleverness was more important than a holistic approach to design and architecture.


> Instead, you appear to be enforcing in which arbitrary location to place ignore rules for debris, seemingly having spent considerable time implementing that, for entirely puritanical reasons.

They're "puritanical reasons" right until you have to migrate to a new version control system, at which point they're the rules that all migration tools enforce -- because those are usually written against the two VCS' specs and current implementations, rather than the myriad of alternate practices that software shops everywhere devise.

I've done migrations like these before -- while the parent poster may be doing all that for the wrong reasons (puritanism) they are absolutely right to do them.


So, just to be clear, when migrating from git to a hypothetical new VCS, you're saying that it will be beneficial to have exclusion rules for some files in the repository, and for others locally on every developer's machine, hopefully?

The "myriad of alternate practices" that you mention are a strawman: We're still talking about ignoring some file manager metadata file. Developers put a line into their .gitignore and be done with it. If that breaks your new VCS, maybe migrating to it isn't such a good idea?


> So, just to be clear, when migrating from git to a hypothetical new VCS, you're saying that it will be beneficial to have exclusion rules for some files in the repository, and for others locally on every developer's machine, hopefully?

Not some arbitrary files, but the ones that the VCS recommends, or at least allows, to be excluded locally, where it makes sense. See e.g. GitHub's note: https://docs.github.com/en/get-started/getting-started-with-... on the matter.

Also not a hypothetical new VCS, I've seen this backfiring several times, and it's usually not related to how good the new VCS is. E.g. when doing a ClearCase -> SVN migration years ago, just cutting out a few global exclusion rules that should have been local (e.g. tags files for people who liked etags, cscope.files for people who liked cscope) reduced a full-history migration time from several days to a few hours.

The two tools did not treat filename encoding the same way, so the migration tool had to walk through the exclusion list at every history point it synchronized. Due to how ClearCase presented its repositories (tl;dr userspace filesystem, years before FUSE) this was very slow on its first Linux versions (it had originally been offered for commercial Unices), not so much because ClearCase sucked but because it exposed a nasty quirk in the kernel's VFS implementation.

> The "myriad of alternate practices" that you mention are a strawman: We're still talking about ignoring some file manager metadata file. Developers put a line into their .gitignore and be done with it.

We're talking about ignoring file manager metadata files specific to some developers on some machines. .gitignore is global. Some details matter.

If you don't expect to ever do cross-platform migrations -- or even much cross-platform development, for that matter -- for repositories maintained across multiple platforms, yes, sure, you don't need that. That doesn't mean OP has a twisted view of adding value. Maybe his team does need that.

Edit: FWIW, migrations are just the nasty point that bites you back years later and usually comes with a huge bill. But putting all local exclusion rules in a single global file backfires in all sorts of ways on large and/or long-lived codebases. Sooner or later some exclusion rule specific to one developer's environment will do the wrong thing in another developer's environment.


Secrets checks should be in git as well because if you’re only checking in CI then your secret has already been published.


This is why it’s important to do a culture fit evaluation on top of a technical evaluation. An unpopular opinion here for sure.


I think a single line in the repo's .gitignore file hurts nobody, while adding CI checks, or catching it in a code review just wastes everyone's time. Expecting everyone to configure their repos precisely is asking too much, IMO.

I've seen all kinds of filters in a .gitignore for programs I don't personally use. I don't mind it at all.


Or you could just set it once per developer in their core.excludesFile and have it apply to all repos.


Let's see. Either we spend 8 seconds adding it to the project .gitignore, once in the lifetime of the project; or we spend 15 minutes instructing everyone on the team, plus once for every new hire, so they can spend another 30 seconds on modifying their local configuration.

I'm pretty sure I'll never create so many new projects in my entire career that those 8 seconds would add up to the amount of time required to follow your suggestion. Be a little more pragmatic, folks!


So now you leak that kludge into CI, and waste code review cycles, when it could have just been in .gitignore


By that logic it should be in your global excludes file. In practice it will save more time if you just add it to your project's .gitignore instead of wasting time having everyone else configure their system. O(1) amount of work compared to O(n).


Yes, it is in fact there. But that's one more level of indirection from the "problem" at hand (because the global excludes file needs a config setting for it to even be considered, whereas .git/info/exclude is present in every repo).


> because the global excludes file needs a config setting for it to even be considered

Using `~/.config/git/ignore` requires no extra setting.


Oh. TIL. Thank you


As long as that isn't set up by default, you can't rely on that being setup correctly for everyone or even most people. So .gitignore should include the file regardless of that setting.


I agree that platform-specific files shouldn't be in .gitignore, but it does make sense to put .* there and then only allow specific .files that you actually want to check in.
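
A sketch of that allowlist approach, using a throwaway repository; `.editorconfig` is just a hypothetical example of a dotfile someone might want tracked.

```shell
# Throwaway repository for the demo (hypothetical path).
repo="$(mktemp -d)"
git init -q "$repo"

# Ignore every dotfile, then re-include the ones meant to be tracked.
cat > "$repo/.gitignore" <<'EOF'
.*
!.gitignore
!.editorconfig
EOF

touch "$repo/.DS_Store" "$repo/.editorconfig"

# check-ignore exits 0 when a path is ignored and 1 when it is not.
git -C "$repo" check-ignore -q .DS_Store && ds_store=ignored || ds_store=tracked
git -C "$repo" check-ignore -q .editorconfig && editorconfig=ignored || editorconfig=tracked
echo ".DS_Store: $ds_store, .editorconfig: $editorconfig"
```

The order matters: the `!` lines have to come after the `.*` line, because the last matching pattern wins.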


Also there is a preference you can set with the defaults command to stop MacOS from putting them on file systems it encounters.
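
The comment doesn't name the preference. The keys usually cited for this are `DSDontWriteNetworkStores` and `DSDontWriteUSBStores` in the `com.apple.desktopservices` domain, so treat the sketch below as an assumption about what was meant. As far as I know they only cover network shares and USB/removable volumes, not local disks, and may need a logout or Finder restart to take effect.

```shell
# Assumption: these are the com.apple.desktopservices keys the comment refers to.
# They affect network and USB/removable volumes only, not local disks.
if [ "$(uname -s)" = "Darwin" ] && command -v defaults >/dev/null 2>&1; then
  defaults write com.apple.desktopservices DSDontWriteNetworkStores -bool true
  defaults write com.apple.desktopservices DSDontWriteUSBStores -bool true
  result="written"
else
  # Not macOS: there is nothing to configure.
  result="skipped"
fi
echo "$result"
```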


An informative nitpick, thank you.


Interesting! The kind of answer ChatGPT can not produce (afaik) because it is too far from the mean mode of how the world is operating


> The only reason myself and millions of other people have to deal with this stupid file is adding it to .gitignore as part of the ritual of spinning up a new codebase

I stole this from somewhere: Create a .gitignore_global file and include .DS_Store, then in the [core] of your .gitconfig include excludesfile = /Users/janedoe/.gitignore_global.


Arguably if you work on a mac and you use git regularly then just add it to ~/.config/git/ignore, and it will be ignored everywhere.


You can ignore it in your global git config, too. Once per host.


Something like:

    git config --global core.excludesfile ~/.gitignore-global
    echo .DS_Store >> ~/.gitignore-global


You don't need to set core.excludesfile any more, it defaults to ~/.config/git/ignore these days
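
A quick way to check that claim without touching your real configuration is to point `HOME` at a scratch directory, as in this sketch (the scratch `HOME` and the `demo` repo name are made up for the demo, and it assumes nothing else sets `core.excludesFile`):

```shell
# Use a throwaway HOME so the real user configuration is not touched.
export HOME="$(mktemp -d)"
unset XDG_CONFIG_HOME

# Default per-user ignore file; no core.excludesFile setting is written.
mkdir -p "$HOME/.config/git"
echo '.DS_Store' >> "$HOME/.config/git/ignore"

# A fresh repository picks the rule up automatically.
repo="$HOME/demo"
git init -q "$repo"
touch "$repo/.DS_Store"
ignored="$(git -C "$repo" check-ignore .DS_Store)"
echo "ignored: $ignored"
```

On a real machine only the `mkdir` and `echo` lines are needed, with your actual home directory.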


I instinctively delete these .DS_Store files whenever I see them but your "creative uses" comment makes me think given common .gitignore practice this might be a great place for AWS keys or whatever you never want checked into VCS /s


Having .DS_Store all over the place sucks, but to be fair so does having other dot files, such as .git which is an entire subdir even. At least the convention of hiding dot files from file listings is uniformly observed.


The difference is that .git is only in directories that you explicitly set up as a git repository whereas .DS_Store, Thumbs.db, .desktop and similar get barfed wherever by the file manager without doing anything that a user would consider as a modification to the directory.


It's not exactly the same thing, but Windows used to have a Thumbs.db to store, well, thumbnails, in each folder that has thumbnail-able files.

I hated it, and most people hated it. MS finally got rid of it (in Vista or 7) and now stores thumbnails in a centralized place.

However, I'm now starting to miss it.

I store lots of large image files (think 1200 dpi PNG scans) on my computer, and it took Windows years to create thumbnails for them, while locking the entire hard disk (and sometimes the whole explorer.exe) in the meantime. This process, while obviously is much slower on spinning HDD, it surprisingly isn't really that fast on NVME SSD, either.

This would be fine if it were a one-time process, but unfortunately the aforementioned centralized thumbnail cache either has an expiration time or a size limit (or both), so in reality if I don't visit a folder for a while, its thumbnails are all gone and have to be generated again, making Explorer unresponsive for some time again.

I have been annoyed by this for a long time but I don't have any solution. The best workaround is to only view such folders in list/detail mode, and rely on a 3rd party image manager (I use XnView) to show the preview/thumbnail, which has a much faster thumbnailing speed and more robust thumbnail cache management.


> This process, while obviously is much slower on spinning HDD, it surprisingly isn't really that fast on NVME SSD, either.

Windows is renowned for having awful performance in use cases involving access to many small files, even for read-only applications. I know a team that even went as far as testing switching away from Windows to build Windows apps and instead switched to cross-builds on Linux just to avoid that performance sinkhole. Meanwhile I worked on a multiplatform project where macOS builds on a Mac mini for the exact same project took less than half the time to build than on a beefed up Windows workstation.

Here's an interesting discussion on HN from a couple of years ago on this topic.

https://news.ycombinator.com/item?id=18783525


It doesn't help that corporate Windows machines are infested with AV snakeoil scanners slowing down FS access.


It took me forever to figure out that windows defender was what was slowing emacs down to a crawl on windows. I even went through the trouble of compiling native-comp and it was still slow.


Related: If you change the folder icon, Finder will store an "Icon?" file in that folder.

More info here: https://superuser.com/a/298798

My question is: Linux and Mac both have the ability to store key-value pairs on files/folders with xattr. Why did or does Apple continue to use .DS_Store?


Good question. Possibly because that approach wouldn't work on readonly files. (But in the case of readonly directories .DS_Store wouldn't work either).

Possibly because not all copy programs copy metadata.

Possibly it's just technical debt: .DS_Store was added to MacOS before xattr and rewriting the Finder to use xattr is not high priority.


Windows too with NTFS attributes. The answer is baseline FAT compatibility.

But at the very least, they could set the FAT hidden attribute on this stuff. It must be some mentality like "Well, I'm not littering in my own yard?"


This article makes me wonder: what tools do you use when reversing a custom file format?

Personally, having done so recently to reverse a custom game archive format, I have only used the usual hexdump, strings, grep, sed, etc. I'm wondering if there are more powerful tools to do this kind of job?
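
As a sketch of that hexdump-style first step, the snippet below checks a file's leading bytes against a signature. It assumes the .DS_Store header starts with `00 00 00 01` followed by ASCII `Bud1`, which is how I remember the format being described; the sample file is a fake 8-byte header built for the demo, not a real .DS_Store.

```shell
# Build a fake 8-byte header for the demo (hypothetical sample file).
sample="$(mktemp)"
printf '\000\000\000\001Bud1' > "$sample"

# Dump the first 8 bytes as hex, stripping od's whitespace.
magic="$(head -c 8 "$sample" | od -An -tx1 | tr -d ' \n')"
echo "header bytes: $magic"

# Assumed signature: 00 00 00 01 followed by ASCII "Bud1".
if [ "$magic" = "0000000142756431" ]; then
  verdict="looks like a .DS_Store"
else
  verdict="unknown format"
fi
echo "$verdict"
```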


You usually want something for which you can define a structure and parse the file with it. There are a bunch of commercial and free tools, I've used the following with success in the past:

- 010 Editor (commercial)

- Kaitai Struct (free & open source)

Also has the advantage of directly generating parsers from the defined structures.



Is there any chance for these droppings to go away and be replaced by a proper database of whatever Finder info is in there?


I think the .DS_Store file is in the classic macOS resource file format: https://dev.os9.ca/techpubs/mac/MoreToolbox/MoreToolbox-99.h...


Is there a reliable way to make the os NOT create these `.DS_Store` files? I really don't like them and the fact that they pollute every folder.

I'm on Big Sur (11.7.1) and I didn't find a way to make finder NOT create them.

It's one of the many irritating idiosyncrasies of this OS.


I don't think there's any way to do that, the files are needed by Finder to store per-folder layouts and settings.

You can however create a cronjob that deletes the files as often as you want, the operation takes 11 seconds on my 1TB SSD with a fairly large home folder:

    fd -u --type file --fixed-strings .DS_Store $HOME -x echo rm {}

Obviously, remove the echo to actually delete the files.


> I don't think there's any way to do that, the files are needed by Finder to store per-folder layouts and settings.

I've always hated this excuse from software developers (I know it's not your excuse, alin23).

User: How do I stop my computer from doing Unwanted Thing X?

Dev: You can't. Unwanted Thing X is needed so we can do Unwanted Thing Y.

Fucking A! Stop doing unwanted things on my computer! My computer should only be creating files I command it to create. It shouldn't be doing things I don't want it to do simply because it conveniences a developer living 1000 miles away in Silicon Valley. My filesystem is there to store my files not some developer's metadata. It should not contain this trash from my operating system. At least give me a DONT_CREATE_THIS_DS_STORE_CRAP environment variable or System Preference to set!

I'll be generous and grant one root directory tree to the OS for its litter. Linux can have /var, macOS can have /System and maybe /Library too. Windows can have C:/WINDOWS. Everywhere else in my OS, hands off, please!


I don't want finder to store per-folder layouts and settings. I don't even care about the .DS_Store files very much. I just don't want the functionality that they exist to enable.


I don't even use finder and I still find these things popping up every now or then.


I'm not aware of any reliable way, but anecdotally I've found that they seem to be mostly triggered by Finder usage, rather than the OS itself. I've gotten out of the habit of using Finder (instead mostly just use terminal or file-hierarchy UIs in whatever relevant app/IDE), and it's a long time since I've had a .DS_Store issue.


While we're here discussing what files should be pre-emptively added to your web server's deny entry: what else should be in there, besides, say, .git?

I guess you could block .* except .well-known 'just in case'


Shouldn't you rather deny all, and carefully carve out paths you need to allow?
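
Whichever policy you pick, it can help to see what dot-entries are actually sitting in the document root. A rough audit sketch, with a made-up `webroot` populated for the demo:

```shell
# Hypothetical document root populated for the demo.
webroot="$(mktemp -d)"
mkdir -p "$webroot/.well-known" "$webroot/.git" "$webroot/img"
touch "$webroot/index.html" "$webroot/.DS_Store" \
      "$webroot/img/.DS_Store" "$webroot/.well-known/security.txt"

# List every dot-entry, pruning .well-known so it and its contents are skipped.
exposed="$(find "$webroot" -name .well-known -prune -o -name '.*' -print | sort)"
echo "$exposed"

# Number of dot-entries found outside .well-known.
count="$(printf '%s\n' "$exposed" | wc -l | tr -d ' ')"
echo "dot-entries: $count"
```

This only lists what is on disk; the actual blocking still has to happen in the web server configuration.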


As a Windows wizard, desktop.ini and thumbs.db both come to mind.


...It's _not_ from the Nintendo DS?


(2018)


Added, thanks


Did you add that? And if so, how?


Yes, a handful of users have the ability to do routine maintenance like add years to titles if there is a reason to do so


Is that because of karma?


Don’t think so; I believe it’s just a set of “trusted” users.



