eupolicy.social is one of the many independent Mastodon servers you can use to participate in the fediverse.
This Mastodon server is a friendly and respectful discussion space for people working in areas related to EU policy. When you request to create an account, please tell us something about yourself.


#filesystems


Hey Fedi!

I need to copy some TBs of data (incl. large files) from an NTFS drive to SOME_FILE_SYSTEM and then that SOME_FILE_SYSTEM to Ext4. (*)

Is there a value for SOME_FILE_SYSTEM that is better than the other choices?

kthxbai

(*) Copying directly is not an option. The drives are all in the same computer.

Folks who know "rsync -F" because they already use it -- am I right in thinking that it adds these behaviours to a sync:

- recursively look for .rsync-filter files in every directory in the copy source, including the top-level

- apply the filters each one contains to the directory tree rooted where that file was found

- exclude those .rsync-filter files from being copied to the destination

Is that right? #rsync #sync #data #sysadmin #filesystem #filesystems
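If I remember the man page right, `-F` is shorthand for `--filter='dir-merge /.rsync-filter'`, and giving it twice (`-FF`) additionally adds `--filter='exclude .rsync-filter'` so the filter files themselves aren't transferred. The dir-merge discovery part can be sketched in a few lines of Python (a hypothetical helper, not rsync's actual code): for a given source file, the `.rsync-filter` files that apply are the ones found on the walk from the transfer root down to that file's own directory.

```python
import os

def applicable_filter_files(path, src_root):
    """Return the .rsync-filter files whose rules would apply to `path`
    under dir-merge semantics: every .rsync-filter found on the walk
    from the transfer root down to the file's own directory.
    (Illustration only, not rsync's actual implementation.)"""
    rel = os.path.relpath(os.path.dirname(path), src_root)
    parts = [] if rel == "." else rel.split(os.sep)
    found = []
    cur = src_root
    for part in [None] + parts:
        if part is not None:
            cur = os.path.join(cur, part)
        candidate = os.path.join(cur, ".rsync-filter")
        if os.path.isfile(candidate):
            found.append(candidate)
    return found
```

Rules merged from a deeper `.rsync-filter` take effect only within that subtree, which matches the second bullet above.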

Replied in thread

@cks

The same thought occurred to me. There have been cute tricks like that played. I've even played some of them on other systems.

But as far as I know that first entry in the i-node table was just unused.

I wonder whether fsck even checked it. Modern FreeBSD fsck_ffs does, but that's a very different beast.

Replied in thread

@cks

Off By One Error in your post on this. (-:

Since 0 means a free directory entry, that's only 65535 i-nodes.

Mind you, at 8 i-nodes per block, you'd come close to the size of some contemporary discs with a maximally sized i-node table.
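For scale, a rough check of that claim, assuming V6/V7-era 512-byte blocks and the 8 i-nodes per block mentioned above (64-byte on-disk i-nodes):

```python
BLOCK_SIZE = 512        # bytes; assumed V6/V7-era block size
INODES_PER_BLOCK = 8    # as stated above (64-byte on-disk i-nodes)
MAX_INODES = 65535      # 16-bit i-numbers, with 0 reserved to mean "free"

table_blocks = -(-MAX_INODES // INODES_PER_BLOCK)  # ceiling division
table_bytes = table_blocks * BLOCK_SIZE
print(table_blocks, table_bytes)  # 8192 blocks, 4194304 bytes (4 MiB)
```

A 4 MiB i-node table is indeed in the same ballpark as, if memory serves, an RK05 cartridge at roughly 2.5 MB.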

Replied in thread

@spacehobo @brouhaha

Yes. Berkeley FFS broke things up into cylinder groups and had an allocation policy of putting various things in the same CG and other things not, because much the same idea was also useful for reducing seeks on discs.

(Even OS/2 got in on the act. (-: Its FAT filesystem driver put the extended attributes record for every file in front of the file data where it could. Which was why it was a bad idea to defragment the "EA DATA. SF" file so that it ended up stuffed all at the start of the volume.)

And the elevator algorithm in the block cache would at least make the tape do straight runs in one direction and then rewind, where it could.

Putting filesystem *and* swap on tape boggles the mind, though. (-:

Picked back up the work for VFS {g,u}id squashing. IOW, mapping all {g,u}ids down to a single {g,u}id. Any process that doesn't have that {g,u}id but is still privileged otherwise will write to disk as the squashed {g,u}id. I just finished a draft and selftests that miraculously work.

web.git.kernel.org/pub/scm/lin

Probably "bugs galore" at this point. Needs more thinking.

#linux #kernel #vfs

One of the main criticisms I read about ZFS (mainly OpenZFS) in forums and articles is that "it's not well integrated into Linux."
It's true - there is a licensing issue, and that shouldn't be underestimated. However, I believe it's wrong to judge it based on this - on FreeBSD, it is perfectly integrated (not to mention the various illumos-based OSes), and in my opinion, it should be judged for what it is, not for its integration into the different Linux distributions.

Surely someone's looked into this: if I wanted to store millions or billions of files on a filesystem, I wouldn't store them in one single subdirectory / folder. I'd split them up into nested folders, so each folder held, say, 100 or 1000 or n files or folders. What's the optimum n for filesystems, for performance or space?
I've idly pondered how to experimentally gather some crude statistics, but it feels like I'm just forgetting to search some obvious keywords.
#BillionFileFS #linux #filesystems #optimization #benchmarking
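No benchmark numbers here, but the bookkeeping side of the question is easy to sketch. Assuming a uniform n-ary layout where every directory holds at most n entries, the nesting depth and rough directory count for N files come out as follows (hypothetical helper, back-of-the-envelope only):

```python
import math

def nary_layout(total_files, fanout):
    """For a uniform layout where each directory holds at most `fanout`
    entries, return (depth, dir_count): levels of nesting needed and an
    approximate number of directories created. Sizing sketch only; real
    filesystems differ (dirent caches, ext4 htree indexes, etc.)."""
    depth = max(1, math.ceil(math.log(total_files, fanout)))
    dirs, level = 0, math.ceil(total_files / fanout)  # leaf directories
    while True:
        dirs += level
        if level == 1:
            break
        level = math.ceil(level / fanout)
    return depth, dirs
```

So a million files at n = 1000 means two levels of ~1000-entry directories. The practical sweet spot is workload-dependent: filesystems with indexed directories (e.g. ext4 with dir_index) tolerate large directories far better than old linear-scan designs did.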

HDD/SSD space should be counted in binary.

1 KiB in binary is 1024 bytes.

A drive sold as 32 TB (32 × 10¹² bytes) in fact offers about 29.1 TiB of unpartitioned/unformatted capacity, since binary counting is how the computer actually uses it.

I know about all those confusing terms you find when you search on different engines; those are just there to confuse and convolute the fact that the drives being sold are under capacity.

Counting storage in decimal is a crime, a marketing scheme which should have been outlawed globally.
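For what it's worth, the conversion is one line of arithmetic (decimal TB = 10¹² bytes, binary TiB = 2⁴⁰ bytes), and all four powers of 1024 have to be applied:

```python
def tb_to_tib(tb):
    """Decimal terabytes (10**12 bytes) to binary tebibytes (2**40 bytes)."""
    return tb * 10**12 / 2**40

print(round(tb_to_tib(32), 2))  # → 29.1
```

So a "32 TB" drive reports roughly 29.1 TiB to binary-counting software.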

Been reading up a bit trying to decide which file system I want to use when I redo my home server soon. Think I'm leaning towards giving btrfs a go. Curious what the splits are on fedi. I've only ever used ext4 on Linux. I'm guessing for desktop/home server use, xfs isn't very popular. Just including it here since it's in this article.

#Linux #FileSystems #EXT4 #BTRFS #ZFS #XFS

blog.usro.net/2024/10/linux-fi

Ultimate Systems Blog · Linux File Systems Comparison
Replied in thread

@Migueldeicaza @davew,

For years, we’ve offered a personal #DataSpaces platform providing an abstraction layer over data managed by #filesystems (including cloud services) and/or #DBMS.

You can even mount #Dropbox, #OneDrive, #S3, etc., with fine-grained access controls for governance.

In the age of #AI, we’re making this usable via natural language as an extension (e.g., #CustomGPT) for #LLM-based chatbots like #ChatGPT.