# Unlimited Subdirectories



## davidkazuhiro (Dec 15, 2011)

I'm struggling with UFS' 32k subdirectory limit. I know I could go with ext4, which has a higher limit of 64k, but is there an option with no limit at all? I looked into XFS, but FreeBSD doesn't fully support it.

Any filesystem suggestions would be appreciated.


----------



## fluca1978 (Dec 15, 2011)

ZFS?


----------



## davidkazuhiro (Dec 15, 2011)

It's hard to find good information on ZFS... Toby on Yahoo! Answers says ZFS can go up to 2^48 subdirectories, but this IBM article says that the limit is 64k-3.

If ZFS is the best, stable, FreeBSD-compatible choice for breaking the 32k limit, is there some good documentation out there which says what the actual limit is?


----------



## mix_room (Dec 15, 2011)

The classical method of reorganizing your directory structure might apply as well.


----------



## fluca1978 (Dec 15, 2011)

mix_room said:

> The classical method of reorganizing your directory structure might apply as well.



Agree!


----------



## gkontos (Dec 15, 2011)

davidkazuhiro said:

> It's hard to find good information on ZFS... Toby on Yahoo! Answers says ZFS can go up to 2^48 subdirectories, but this IBM article says that the limit is 64k-3.
> 
> If ZFS is the best, stable, FreeBSD-compatible choice for breaking the 32k limit, is there some good documentation out there which says what the actual limit is?



You can read more information about ZFS on Oracle's website.

May I ask what type of environment you have that requires so many subdirectories? (just plain curiosity)

Regards


----------



## Sylgeist (Dec 15, 2011)

The IBM article linked to is NOT for Oracle's ZFS. It is for z/OS on IBM mainframe hardware. Just in case anyone was confused.


----------



## davidkazuhiro (Dec 16, 2011)

gkontos said:

> May I ask what type of environment you have that requires so many subdirectories? (just plain curiosity)



We are storing DeepZoom tiles on FreeBSD to take advantage of the kernel's asynchronous I/O and so that we can serve them with nginx.



			
				mix_room said:
			
		

> The classical method of reorganizing your directory structure might apply as well.



We already did, but we will still hit the limit eventually because DeepZoom creates many subdirectories by design for every file uploaded by our users. So even if we reorganize our directory structure into very fine categories, if a user uploads more than n files within a category (where n = the filesystem's subdirectory limit), DeepZoom will create that many subdirectories in that directory, and we reach a breaking point.

So yes, I can reduce the risk the classical way (as I already did), but it still does not provide a guarantee.

Of course, unless the subdirectory limit is infinite, merely increasing the limit is practically the same thing as reorganizing the directory structure.




			
				gkontos said:
			
		

> You can read more information about ZFS in Oracle's website.



Thanks gkontos, I'll read up on ZFS there.


----------



## davidkazuhiro (Dec 16, 2011)

OK, the best I could find in Oracle's documentation is this:



> Unparalleled Scalability
> 
> A key design element of the ZFS file system is scalability. The file system itself is 128 bit, allowing for 256 quadrillion zettabytes of storage. All metadata is allocated dynamically, so no need exists to preallocate inodes or otherwise limit the scalability of the file system when it is first created. All the algorithms have been written with scalability in mind. Directories can have up to 2^48 (256 trillion) entries, and no limit exists on the number of file systems or the number of files that can be contained within a file system.



Source: What is ZFS?

What does "entries" mean here? Only subdirectories? Or everything including files, subdirectories, symbolic links and so forth?


----------



## fluca1978 (Dec 16, 2011)

davidkazuhiro said:

> We already did, but we will still hit the limit eventually because DeepZoom creates many subdirectories by design for every file uploaded by our users.



Another thing that could help is to split the categories (or the tree structure) among different filesystems, though that might not be so easy to implement right now. Or even serve different parts of the tree via different virtual machines... but I admit there is no real fix for what I think is a bad design (of the product).
Surely ZFS will give you some breathing room.


----------



## fnucc (Dec 25, 2011)

I don't know if this problem is solved already, but would it be possible to have a category that consists of more categories? For example, a user is uploading pictures to the "Flowers" category; after some number, let the files go into "Flowers1", "Flowers2"... internal subdirectories, and of course your user will only see "Flowers".
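
Something like that could be a thin mapping layer in front of the filesystem. A rough Python sketch (the `LIMIT` value, root path, and function names are just illustrative, not from any real product):

```python
import os

LIMIT = 30000  # illustrative: stay safely below UFS' ~32k subdirectory cap

def internal_dir(root: str, category: str, upload_index: int) -> str:
    """Map the N-th upload in a logical category to an internal bucket
    directory: Flowers, Flowers1, Flowers2, ...  The user-facing name
    stays "Flowers"; only the on-disk layout rolls over."""
    bucket = upload_index // LIMIT
    name = category if bucket == 0 else f"{category}{bucket}"
    return os.path.join(root, name)
```

The application would keep a per-category upload counter and resolve the on-disk directory through this function, so the rollover is invisible to users.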


----------



## xibo (Dec 26, 2011)

fnucc said:

> I don't know if this problem is solved already, but would it be possible to have a category that consists of more categories? For example, a user is uploading pictures to the "Flowers" category; after some number, let the files go into "Flowers1", "Flowers2"... internal subdirectories, and of course your user will only see "Flowers".



However, you need the directory contents to be more or less equally distributed. A double-hashing approach to distribute files into $STORAGE/h1/h2/$FILENAME (h1 = user id, h2 = pathname-based hash) might be viable, if it's easy enough to introduce it to the file-managing software.
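
For the record, a rough Python sketch of that layout (the storage root and the choice of SHA-1 for h2 are just placeholders, not a recommendation):

```python
import hashlib
import os

STORAGE = "/var/storage"  # illustrative root

def tile_path(user_id: str, pathname: str) -> str:
    """Map a file to $STORAGE/h1/h2/$FILENAME, with h1 = user id and
    h2 = a short hash of the pathname.  Using 2 hex digits for h2
    caps each user directory at 256 subdirectories, well under any
    filesystem limit, while spreading files roughly evenly."""
    h2 = hashlib.sha1(pathname.encode()).hexdigest()[:2]
    return os.path.join(STORAGE, user_id, h2, os.path.basename(pathname))
```

Widening h2 (more hex digits, or an extra nesting level) trades deeper paths for fewer files per directory.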


----------



## fluca1978 (Dec 27, 2011)

xibo said:

> However, you need the directory contents to be more or less equally distributed. A double-hashing approach to distribute files into $STORAGE/h1/h2/$FILENAME (h1 = user id, h2 = pathname-based hash) might be viable, if it's easy enough to introduce it to the file-managing software.



Agree.
After all, it is always quite simple to split the file name into tokens (e.g., one letter at a time) to get a poor man's balancing algorithm.
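
E.g., a minimal Python version of that letter-at-a-time split (depth and names are illustrative):

```python
import os

def fanout_path(root: str, filename: str, depth: int = 2) -> str:
    """Use the first `depth` characters of the filename as nested
    one-letter directories, e.g. rose.jpg -> root/r/o/rose.jpg.
    Balance is only as good as the distribution of leading letters,
    hence "poor man's" balancing compared to a real hash."""
    letters = list(filename[:depth])
    return os.path.join(root, *letters, filename)
```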


----------



## throAU (Jan 6, 2012)

davidkazuhiro said:

> What does "entries" mean here? Only subdirectories? Or everything including files, subdirectories, symbolic links and so forth?



Does it matter?

Do you realise just how big 256 trillion is?

I would wager you will most certainly run out of storage before you put that many files or subdirectories in a directory.


----------

