Nils Goroll
slink@schokola.de
[ha-clusters-discuss] fssnap
Posted:
Nov 3, 2009 1:18 PM
Hi All,
could anyone give me a brief summary of the technical reasons why UFS snapshots are not supported in cluster?
Some reasons I could imagine:
- The PxFS layer is incompatible with UFS snapshots (does not sound likely?!?)
- failover of the PxFS master conflicts with snapshots
- issues with HAStoragePlus
Any of the above? Anything else?
A rough sketch of what would need to be done to support UFS snapshots would be ideal.
Thanks, Nils
_______________________________________________
ha-clusters-discuss mailing list
ha-clusters-discuss at opensolaris dot org
http://mail.opensolaris.org/mailman/listinfo/ha-clusters-discuss
Posts:
24
From:
DE
Registered:
10/29/07
Re: [ha-clusters-discuss] fssnap
Posted:
Nov 3, 2009 1:22 PM
in response to: Nils Goroll
Hi Nils,
the 2 main reasons I can remember:
- fssnap takes time, maybe too much - which could lead to all sorts of problems
- fssnap does not survive a failover; although I could have lived with this restriction.
And it has never been really tested, as far as I know.
Regards
hartmut
--
Hartmut Streppel, Systems Practice
Sun Microsystems GmbH
Sonnenallee 1, D-85551 Kirchheim-Heimstetten, Germany
Phone: +49 (0)89 46008 2563
Mobile: +49 (0)172 8919711
FAX: +49 (0)89 46008 2572
http://www.sun.de
mailto: hartmut dot streppel at sun dot com
My BLOG: http://blogs.sun.com/Hartmut
Registered office: Sun Microsystems GmbH, Sonnenallee 1, D-85551 Kirchheim-Heimstetten
Commercial register: Amtsgericht München, HRB 161028
Managing directors: Thomas Schröder, Wolfgang Engels, Wolf Frenkel
Chairman of the supervisory board: Martin Häring
Nils Goroll
slink@schokola.de
Re: [ha-clusters-discuss] fssnap
Posted:
Nov 3, 2009 1:26 PM
in response to: hs86490
Hi Hartmut,
thank you for your quick reply.
> - fssnap takes time, maybe too much - which could lead to all sorts of
> problems
OK, but it would seem possible to temporarily disable monitoring of all resources depending on the respective filesystem, right?
> - fssnap does not survive a failover; although I could have lived with
> this restriction.
Same here.
Nils
Posts:
24
From:
DE
Registered:
10/29/07
Re: [ha-clusters-discuss] fssnap
Posted:
Nov 4, 2009 2:43 AM
in response to: Nils Goroll
Hi Nils,
you have already received a much better technical explanation than mine. Still, there is one more answer below.
On 11/03/09 22:26, Nils Goroll wrote:
> Hi Hartmut,
>
> thank you for your quick reply.
>
>> - fssnap takes time, maybe too much - which could lead to all sorts
>> of problems
>
> OK, but it would seem possible to temporarily disable monitoring of
> all resources depending on the respective filesystem, right?

Yes, that should work. But people wanted to use it for the root filesystem. And I think fssnap would not work there.

>> - fssnap does not survive a failover; although I could have lived
>> with this restriction.

The problem is with the data and state of fssnap if a node should die during fssnap taking the snapshot.

Regards
hartmut

> Same here.
>
> Nils
Posts:
51
From:
Menlo Park
Registered:
6/1/08
Re: [ha-clusters-discuss] fssnap
Posted:
Nov 3, 2009 8:23 PM
in response to: Nils Goroll
On Wed, Nov 4, 2009 at 2:48 AM, Nils Goroll <slink at schokola dot de> wrote:
> Hi All,
>
> could anyone give me a brief summary of the technical reasons why UFS
> snapshots are not supported in cluster?
>
> Some reasons I could imagine:
> - The PxFS layer is incompatible with UFS snapshots (does not sound
> likely?!?)
> - failover of the PxFS master conflicts with snapshots
fssnap needs to be run on the UFS mount point directly. It works closely with the in-memory data structures of UFS. Thus, if fssnap is to work with PxFS, the PxFS client and server (master) need to be able to understand the fssnap ioctl. IIRC this support is not present in PxFS today.
As you know already, once a PxFS mount is done you don't have access to the on-disk UFS mount. The only access is through the client mounts. In addition fssnap needs local storage where the UFS file system is. Thus technically this is not a very simple feature to implement either.
cheers Binu
Nils Goroll
slink@schokola.de
Re: [ha-clusters-discuss] fssnap
Posted:
Nov 4, 2009 3:06 AM
in response to: binujp
Hi Buni and Hartmut,
thank you both for your explanations.
Binu, the explanation you gave regarding PxFS is along the lines of what I'd expected.
Can we summarize the topic like this?
* To make fssnap work on PxFS ("global") mounts, several additional requirements regarding the PxFS layer and PxFS/UFS interoperation had to be fulfilled, so it seems to be far from trivial to implement fssnap on PxFS.
* For non-PxFS ("HA-local") mounts, fssnap should work technically, even though it is not supported. Care should be taken to minimize the impact of the necessary I/O pauses during the snapshot process, for instance by temporarily disabling cluster monitoring of resources depended upon the filesystem to be snapshotted.
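The HA-local case above could be sketched roughly as follows. This is only a dry-run planner that builds command lines and runs nothing; it assumes the Sun Cluster 3.2 `clresource` CLI and the `fssnap -F ufs -o bs=...` form from fssnap_ufs(1M). The function name, mount point, backing-store path and resource names are hypothetical.

```python
from typing import List


def plan_snapshot(mount_point: str, backing_store: str,
                  resources: List[str]) -> List[List[str]]:
    """Return, in order, the commands for snapshotting an HA-local UFS mount.

    Hypothetical helper: it only builds argument lists, it executes nothing.
    """
    plan: List[List[str]] = []
    # 1. Stop monitoring every resource that depends on the filesystem, so
    #    the I/O pause during snapshot creation is not mistaken for a fault.
    for res in resources:
        plan.append(["clresource", "unmonitor", res])
    # 2. Create the UFS snapshot; bs= names the backing-store file
    #    (assumed here to live on a different, node-local filesystem).
    plan.append(["fssnap", "-F", "ufs", "-o", f"bs={backing_store}", mount_point])
    # 3. Re-enable monitoring once fssnap has returned.
    for res in resources:
        plan.append(["clresource", "monitor", res])
    return plan


if __name__ == "__main__":
    # All names below are made up for illustration.
    for cmd in plan_snapshot("/ha/data", "/var/tmp/data.snap", ["app-rs"]):
        print(" ".join(cmd))
```

A real script would want to re-enable monitoring even if fssnap fails, and the snapshot would still be lost on a node failure or failover.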
I'll see where I get from here.
Hartmut, regarding your comment:
> The problem is with the data and state of fssnap if a node should die > during fssnap taking the snapshot.
Sure, UFS snapshots are temporary (see fssnap_ufs(1M)), but IIUC, snapshotting an ha-local FS in a cluster should not be any different from snapshotting any other (non-root) FS in that the (clustered or non clustered) node may die at any time, so the snapshot will get lost, right?
Again, thank you very much!
Nils
Posts:
51
From:
Menlo Park
Registered:
6/1/08
Re: [ha-clusters-discuss] fssnap
Posted:
Nov 4, 2009 3:26 AM
in response to: Nils Goroll
On Wed, Nov 4, 2009 at 4:36 PM, Nils Goroll <slink at schokola dot de> wrote:
> Hi Buni and Hartmut,
>
> thank you both for your explanations.
>
> Binu, the explanation you gave regarding PxFS is along the lines of what I'd
> expected.
>
> Can we summarize the topic like this?
>
> * To make fssnap work on PxFS ("global") mounts, several additional
> requirements regarding the PxFS layer and PxFS/UFS interoperation had to be
> fulfilled, so it seems to be far from trivial to implement fssnap on PxFS.
>
> * For non-PxFS ("HA-local") mounts, fssnap should work technically, even
> though it is not supported. Care should be taken to minimize the impact of
> the necessary I/O pauses during the snapshot process, for instance by
> temporarily disabling cluster monitoring of resources depended upon the
> filesystem to be snapshotted.
Even for non-HA mounts, the project complexity will be the same. The difference between HA and non-HA PxFS is in the server/master: whether or not the underlying UFS is wrapped in a failover-capable object. All client/server communication is identical. In both cases all fs activity should be stopped by the server while the snapshot is in progress.
If snapshotting is supported in some manner, the difference for a non-HA PxFS server will be that it does not check whether a snapshot is in progress before allowing a switchover. Please keep in mind that this is not an idea proven to be implementable; I am just guessing at what is possible.
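To make the guess above concrete, here is a toy model of that behaviour. It is purely illustrative and not based on actual PxFS code; the class, attribute and method names are all hypothetical.

```python
class PxfsServerModel:
    """Toy model of a PxFS server/master during a snapshot (not real PxFS code)."""

    def __init__(self, ha: bool) -> None:
        # ha=True: the underlying UFS is wrapped in a failover-capable object.
        self.ha = ha
        self.snapshot_in_progress = False
        self.fs_activity_allowed = True

    def begin_snapshot(self) -> None:
        # In both the HA and the non-HA case the server stops all fs
        # activity while the snapshot is being taken.
        self.fs_activity_allowed = False
        self.snapshot_in_progress = True

    def end_snapshot(self) -> None:
        self.snapshot_in_progress = False
        self.fs_activity_allowed = True

    def switchover_blocked(self) -> bool:
        # Only the HA server checks for a running snapshot before it
        # allows a switchover; the non-HA server performs no such check.
        return self.ha and self.snapshot_in_progress
```

In this model the client/server side is identical for both variants; the only difference is the extra check in `switchover_blocked`.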
cheers Binu
Nils Goroll
slink@schokola.de
Re: [ha-clusters-discuss] fssnap
Posted:
Nov 4, 2009 5:47 AM
in response to: binujp
Hi Binu and Hartmut,
(Binu, sorry for the typo of your name in my last mail)
>> * For non-PxFS ("HA-local") mounts, fssnap should work technically, even
>> though it is not supported. Care should be taken to minimize the impact of
>> the necessary I/O pauses during the snapshot process, for instance by
>> temporarily disabling cluster monitoring of resources depended upon the
>> filesystem to be snapshotted.
>
> Even for non-HA mounts, the project complexity will be the same.
I hope I understand the difference between HA and non-HA PxFS, but I was referring to non-PxFS mounts on either node of the cluster, which I know by the name "ha-local".
My understanding is that, in this case, no PxFS is involved - correct?
Thank you anyway for your additional explanations regarding the two PxFS cases.
Hartmut,
>> Sure, UFS snapshots are temporary (see fssnap_ufs(1M)), but IIUC,
>> snapshotting an ha-local FS in a cluster should not be any different
>> from snapshotting any other (non-root) FS in that the (clustered or
>> non clustered) node may die at any time, so the snapshot will get
>> lost, right?
>
> Correct! But there was a discussion along the lines that this is
> unacceptable in an HA environment.
I think it probably will be for many HA applications, but some might live happily with that limitation, so, as so often, the best answer is probably "it depends". ;-)
Nils
Posts:
51
From:
Menlo Park
Registered:
6/1/08
Re: [ha-clusters-discuss] fssnap
Posted:
Nov 4, 2009 6:47 AM
in response to: Nils Goroll
On Wed, Nov 4, 2009 at 7:17 PM, Nils Goroll <slink at schokola dot de> wrote:
> Hi Binu and Hartmut,
>
> (Binu, sorry for the typo of your name in my last mail)
>
>>> * For non-PxFS ("HA-local") mounts, fssnap should work technically, even
>>> though it is not supported. Care should be taken to minimize the impact
>>> of the necessary I/O pauses during the snapshot process, for instance by
>>> temporarily disabling cluster monitoring of resources depended upon the
>>> filesystem to be snapshotted.
>>
>> Even for non-HA mounts, the project complexity will be the same.
>
> I hope I understand the difference between HA and non-HA PxFS, but I was
> referring to non-PxFS mounts on either node of the cluster, which I know by
> the name "ha-local".
Ooops, my bad, didn't read your reply properly.
> My understanding is that, in this case, no PxFS is involved - correct?
Your understanding is correct. For file systems controlled by HASP (HAStoragePlus) there is no PxFS, and it should be possible to use fssnap with judicious control of who is accessing the file system.
cheers
Binu
Nils Goroll
slink@schokola.de
Re: [ha-clusters-discuss] fssnap
Posted:
Nov 4, 2009 7:48 AM
in response to: binujp
Hi Binu,
thank you for your final clarification. I think we've got a pretty complete picture now.
Nils
Posts:
24
From:
DE
Registered:
10/29/07
Re: [ha-clusters-discuss] fssnap
Posted:
Nov 4, 2009 4:19 AM
in response to: Nils Goroll
Hi Nils,
On 11/04/09 12:06, Nils Goroll wrote:
> Hi Buni and Hartmut,
> ...
> Hartmut, regarding your comment:
>
> > The problem is with the data and state of fssnap if a node should die
> > during fssnap taking the snapshot.
>
> Sure, UFS snapshots are temporary (see fssnap_ufs(1M)), but IIUC,
> snapshotting an ha-local FS in a cluster should not be any different
> from snapshotting any other (non-root) FS in that the (clustered or
> non clustered) node may die at any time, so the snapshot will get
> lost, right?

Correct! But there was a discussion along the lines that this is unacceptable in an HA environment. Not me btw.
Regards Hartmut