OpenSolaris

Thread: ZFS Roadmap - thoughts on expanding raidz / restriping / defrag

Replies: 3 - Last Post: Dec 18, 2007 2:54 AM by: paulz
myxiplx

Posts: 877
From: GB

Registered: 10/24/07
ZFS Roadmap - thoughts on expanding raidz / restriping / defrag
Posted: Dec 17, 2007 2:29 AM
To: Communities » zfs » discuss

Hey folks,

Does anybody know if any of these are on the roadmap for ZFS, or have any idea how long it's likely to be before we see them (we're in no rush - late 2008 would be fine with us, but it would be nice to know they're being worked on)?

I've seen many people ask for the ability to expand a raid-z pool by adding devices. I'm wondering if it would be useful to work on a defrag / restriping tool to work hand in hand with this.

I'm assuming that when the functionality is available, adding a disk to a raid-z set will mean the existing data stays put, and new data is written across a wider stripe. That's great for performance for new data, but not so good for the existing files. Another problem is that you can't guarantee how much space will be added. That will have to be calculated based on how much data you already have.

For example: if you have a simple raid-z of five 500GB drives, you would expect adding another drive to add 500GB of space. However, if your pool is half full, you can only make use of 250GB of the new drive; the other 250GB is going to be wasted.
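
The arithmetic in this example can be checked with a short sketch. It models only the assumption stated in the post (existing data stays put, and a new full-width stripe needs free space on every disk). The function and parameter names are illustrative, and this is not a description of how ZFS accounts for space.

```python
def new_disk_usage(disk_gb, fill_fraction):
    """Return (usable_gb, stranded_gb) on a newly added disk.

    Assumption (from the post, not from ZFS itself): old data is not
    moved, and a new full-width stripe needs free space on every disk,
    so the new disk can be filled only as far as the old disks have room.
    """
    free_per_old_disk = disk_gb * (1.0 - fill_fraction)
    usable = min(disk_gb, free_per_old_disk)
    return usable, disk_gb - usable


usable, stranded = new_disk_usage(disk_gb=500, fill_fraction=0.5)
print(f"usable: {usable:.0f} GB, stranded: {stranded:.0f} GB")
```

With 500GB drives and a half-full pool this gives 250GB usable and 250GB stranded, matching the figures in the example.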

What I would propose to solve this is to implement a defrag / restripe utility as part of the raid-z upgrade process, making it a three-step process:

- New drive added to raid-z pool
- Defrag tool begins restriping and defragmenting old data
- Once restripe complete, pool reports the additional free space
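
A rough sketch of what the three steps would mean for space accounting follows. It is a toy model under loud assumptions: every stripe spans the full width of the set with a fixed number of parity disks, which real raid-z does not guarantee (its stripes are variable-width). The function name `restripe` and its parameters are hypothetical.

```python
def restripe(data_gb, disk_gb, old_width, new_width, parity=1):
    """Toy model of the three steps; returns figures in GB.

    Assumes every stripe spans the full width with `parity` parity
    disks. Real raid-z stripes are variable-width, so this is only an
    approximation for illustration.
    """
    # Step 1: the new drive(s) join the pool; raw capacity grows.
    raw_capacity = new_width * disk_gb

    # Step 2: old data is rewritten from old_width to new_width stripes,
    # so the same data needs proportionally less parity.
    raw_before = data_gb * old_width / (old_width - parity)
    raw_after = data_gb * new_width / (new_width - parity)

    # Step 3: the pool reports free space based on the new layout.
    free_data_gb = (raw_capacity - raw_after) * (new_width - parity) / new_width
    return raw_before, raw_after, free_data_gb


# Five 500GB drives, half full (1000GB of data), expanded to six drives.
before, after, free = restripe(data_gb=1000, disk_gb=500, old_width=5, new_width=6)
print(f"raw used before: {before:.0f} GB, after: {after:.0f} GB, free for data: {free:.0f} GB")
```

In this model the rewritten data occupies slightly less raw space than before, because each stripe carries the same parity over more data disks.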

There are some limitations to this. You would probably want to advise that expanding a raid-z pool should only be done with a reasonable amount of free disk space, and that it may take some time. It may also be beneficial to allow adding multiple disks in one go.

However, if it works it would seem to add several benefits:
- Raid-z pools can be expanded
- ZFS gains a defrag tool
- ZFS gains a restriping tool

bonwick

Posts: 124
From:

Registered: 3/9/05
Re: ZFS Roadmap - thoughts on expanding raidz / restriping / defrag
Posted: Dec 17, 2007 2:42 AM   in response to: myxiplx

In short, yes. The enabling technology for all of this is something
we call bp rewrite -- that is, the ability to rewrite an existing
block pointer (bp) to a new location. Since ZFS is COW, this would
be trivial in the absence of snapshots -- just touch all the data.
But because a block may appear in many snapshots, there's more to it.
It's not impossible, just a bit tricky... and we're working on it.

Once we have bp rewrite, many cool features will become available as
trivial applications of it: on-line defrag, restripe, recompress, etc.
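
To illustrate why snapshots complicate this, here is a toy sketch in Python. It is not ZFS's on-disk format or code: the `ToyPool` class, its method names, and the idea of storing snapshots as plain copies of a pointer table are all assumptions made for illustration. It only shows that "touching" data moves the live copy while a snapshot keeps the old block pinned, and that relocating the old block means updating every pointer to it.

```python
class ToyPool:
    """Toy copy-on-write pool: a block store plus pointer tables.

    Not ZFS's on-disk format; snapshots here are plain copies of the
    live pointer table, which is enough to show the sharing problem.
    """

    def __init__(self):
        self.blocks = {}      # address -> data
        self.live = {}        # name -> address (the "block pointers")
        self.snapshots = []   # frozen copies of self.live
        self.next_addr = 0

    def _alloc(self, data):
        addr = self.next_addr
        self.next_addr += 1
        self.blocks[addr] = data
        return addr

    def _free_if_unreferenced(self, addr):
        if addr is None:
            return
        tables = [self.live] + self.snapshots
        if not any(addr in t.values() for t in tables):
            del self.blocks[addr]

    def write(self, name, data):
        # Copy-on-write: never overwrite in place, always allocate anew.
        old = self.live.get(name)
        self.live[name] = self._alloc(data)
        self._free_if_unreferenced(old)

    def snapshot(self):
        self.snapshots.append(dict(self.live))

    def touch(self, name):
        # "Just touch all the data": rewrites the live copy only.
        self.write(name, self.blocks[self.live[name]])

    def bp_rewrite(self, old_addr):
        # Relocate a block and fix every pointer to it, live or snapshot.
        new_addr = self._alloc(self.blocks.pop(old_addr))
        for table in [self.live] + self.snapshots:
            for name, addr in table.items():
                if addr == old_addr:
                    table[name] = new_addr
        return new_addr


pool = ToyPool()
pool.write("f", "hello")   # block lands at address 0
pool.snapshot()            # the snapshot also points at address 0
pool.touch("f")            # the live copy moves to a new address ...
print(0 in pool.blocks)    # ... but address 0 is still held by the snapshot
pool.bp_rewrite(0)         # relocate it and repoint the snapshot
print(0 in pool.blocks, pool.snapshots[0]["f"])
```

Without the snapshot, `touch` alone would have freed address 0; with it, the old block can only move if the snapshot's pointer is rewritten too.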

Jeff

_______________________________________________
zfs-discuss mailing list
zfs-discuss at opensolaris dot org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Robert Milkowski
rmilkowski@task.gda.pl
Re: ZFS Roadmap - thoughts on expanding raidz / restriping / defrag
Posted: Dec 17, 2007 5:01 AM   in response to: bonwick

Hello Jeff,

Monday, December 17, 2007, 10:42:18 AM, you wrote:

JB> In short, yes. The enabling technology for all of this is something
JB> we call bp rewrite -- that is, the ability to rewrite an existing
JB> block pointer (bp) to a new location. Since ZFS is COW, this would
JB> be trivial in the absence of snapshots -- just touch all the data.
JB> But because a block may appear in many snapshots, there's more to it.
JB> It's not impossible, just a bit tricky... and we're working on it.

JB> Once we have bp rewrite, many cool features will become available as
JB> trivial applications of it: on-line defrag, restripe, recompress, etc.



Cool.
Do you have some estimates on time frames? Last time it was said to be
late this year...

--
Best regards,
Robert Milkowski <rmilkowski at task dot gda dot pl>
http://milek.blogspot.com



paulz

Posts: 27
From:

Registered: 6/15/05
Re: ZFS Roadmap - thoughts on expanding raidz / restriping / defrag
Posted: Dec 18, 2007 2:54 AM   in response to: bonwick


On 17 Dec 2007, at 11:42, Jeff Bonwick wrote:

> In short, yes. The enabling technology for all of this is something
> we call bp rewrite -- that is, the ability to rewrite an existing
> block pointer (bp) to a new location. Since ZFS is COW, this would
> be trivial in the absence of snapshots -- just touch all the data.
> But because a block may appear in many snapshots, there's more to it.
> It's not impossible, just a bit tricky... and we're working on it.
>
> Once we have bp rewrite, many cool features will become available as
> trivial applications of it: on-line defrag, restripe, recompress, etc.
>

Does that include evacuating vdevs? Marking a vdev read-only and
then doing a rewrite pass would clear out the vdev, wouldn't it?
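
The idea in this question can be sketched as a toy relocation pass. Whether bp rewrite would actually support evacuation is not confirmed anywhere in this thread. The `evacuate` function, the vdev names, and the round-robin placement are all hypothetical, and a real pass would also have to copy the data and update block pointers.

```python
def evacuate(block_locations, read_only_vdev, writable_vdevs):
    """Return a new block -> vdev map with nothing left on read_only_vdev.

    block_locations: dict of block id -> vdev name (hypothetical names).
    Blocks on the read-only vdev are reassigned round-robin; a real
    rewrite pass would also copy the data and update block pointers.
    """
    if not writable_vdevs:
        raise ValueError("need at least one writable vdev")
    result = {}
    moved = 0
    for block, vdev in block_locations.items():
        if vdev == read_only_vdev:
            result[block] = writable_vdevs[moved % len(writable_vdevs)]
            moved += 1
        else:
            result[block] = vdev
    return result


locations = {"a": "vdev0", "b": "vdev1", "c": "vdev1"}
after = evacuate(locations, read_only_vdev="vdev1", writable_vdevs=["vdev0", "vdev2"])
print(after)
```

Once no block maps to the read-only vdev, it holds nothing the pool still needs, which is the "cleared out" state the question describes.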

Paul






Copyright © 1995-2005 Sun Microsystems, Inc.