Discussion about backporting the O(1) Scheduler to the 2.4 stable kernel [earlier story] continues on the lkml. Ingo Molnar, the scheduler's author, maintains that much more testing is required:
"it might be a candidate for inclusion once it has _proven_ stability and robustness (in terms of tester and developer exposion), on the same order of magnitude as the 2.4 kernel - but that needs time and exposure in trees like the -ac tree and vendor trees. It might not happen at all, during the lifetime of 2.4."
Joe Sloan was among those who countered Ingo's comments:
"Ingo, it's apparent you are refraining from pushing this O(1) scheduler - that's admirable, but don't swing too far in the other direction. The fact is, it's working well in 2.5, it's working well in the 2.4-ac tree, it's working well in the 2.4-aa tree, and Red Hat has been shipping it."
Personally, I've been using and enjoying the O(1) scheduler on my desktop server since January, when it became compatible with the preemptible kernel patch [earlier story]. The following discussion makes for an interesting read, with good points made on both sides.
From: Rob Landley
Subject: Re: [OKS] O(1) scheduler in 2.4
Date: Tue, 2 Jul 2002 21:11:31 -0400

On Monday 01 July 2002 10:48 pm, Tom Rini wrote:
> I assume you mean 2.4.60 here, and no, I don't think O1 scheduler should
> go into 2.4 ever. We're aiming for a _stable_ series here. Let me

Ah, monday morning virtue, overcompensating for 2.4.10. It's the hangover speaking...

"We upgrade our kernel on a production machine without testing it first, and we get mad if anything actually CHANGED. We want that upgrade to be a NOP, darn it! We want it to be as if we never did it in the first place, that's why we do it..."

If you want stone tablet stability, why the heck are you upgrading your kernel? Downloading the new version off of kernel.org generally means you're mucking about with a working box, making changes that are not 100% required.

If a security vulnerability comes out, you have the source and can patch the specific bug in your version. (If you're not up to that, you're probably using a vendor kernel, which is a whole 'nother can of worms.)

If you install new hardware or software, and it going "boing" would be a bad thing, you try it on a scratch box first. If you don't, you deserve what you get.

I'm under the impression 2.4.19 is introducing chunks of Andre Hedrick's new IDE code. So it's ok to upgrade something that can, in case of a bug, eat your data silently in a way that journaling won't detect. Why? LBA-48 and ATA-133, of course.

But scheduling, which is SUPPOSED to be non-deterministic from day one and could theoretically be brain-dead round robin without affecting anything but performance... That's not safe to upgrade. Right.

If you have a race condition in your code that a new scheduler triggers, ANYTHING could trigger it.

2.4.18 behaves horribly under load, try md5sum on an iso image and then pull up an xterm and su to another user. It can take 30 seconds.
(Yeah, that's mostly IO starvation rather than the scheduler, but still, how is the new scheduler going to do WORSE than this?)

The argument here is basically "don't change anything". It's not exactly a series then, is it? If you want trailing edge, 2.0 is still being maintained, let alone 2.2. Those have a great excuse for not accepting anything new beyond a really obvious bugfix. 2.4 does not, because 2.6 isn't out yet. Backporting of somethings from 2.5 to 2.4 will occur until then, and O(1) is an obvious eventual candidate.

> stress that again, _stable_. I'd hope that 2.4.60 is as slow in coming
> as 2.0.40 is.

So the fact that it's in Alan Cox's kernel (meaning Red Hat is shipping it in 2.4.18-5.55, meaning that if more people aren't actually USING it yet than marcelo's 2.4, they will be soon), and andrea's kernel (meaning new VM development is being done with it in mind)... It may not be "sufficiently tested" yet but it's GETTING a lot of testing. You use anything EXCEPT a stock vanilla 2.4, you're probably getting O(1) at this point.

If the vendors are starting to ship the thing already, what is the DOWN side to integrating it?

The down side to NEVER integrating it is eventually fewer people using the kernel off of kernel.org.

Does this remind anybody else of the 0.90 software raid stuff? At some point it makes more sense to keep the OLD one around as a patch for the 5% of the community that doesn't want to upgrade. We're not there on the scheduler yet, but "should not happen" without a qualifier means "never"...

> > >c) I also suspect that it hasn't been as widley tested on !x86 as the
> > >stuff currently in 2.4. And again, 2.4 is the stable tree.
> >
> > I know it is not a priority for 2.4, but say it wil never happen...
>
> I won't say it will never happen, just that I don't think it should.
> It's a rather invasive thing (and as Ingo said, it's just not getting
> stable).
Ingo's main objection was that the patch is only 6 months old, and that 2.4 is only now stabilizing and that bug squeezing and smoothing should be given a little longer to ensure that people have the option of NOT upgrading, and that those upgrading want improvements rather than critical "this just doesn't work" fixes. And that's a fine argument.

But 2.6 isn't going to be out this year. It's not even having its first freeze until October. Traditionally, we've been running a year and a half between stable releases (and another six months to actually get the new one battle-tested to where the distros and at least 50% of the production boxes upgrade.) We've got a year to eighteen months left on that cycle. Are the distros going to hold off adding it to 2.4 for a year to 18 months?

The real question is, how much MORE conservative than the distros should the mainline kernels be?

Rob

From: Ingo Molnar
Subject: Re: [OKS] O(1) scheduler in 2.4
Date: Wed, 3 Jul 2002 10:35:26 +0200 (CEST)

On Tue, 2 Jul 2002, Rob Landley wrote:
> If you want stone tablet stability, why the heck are you upgrading your
> kernel? [...]

to get security and stability fixes.

> The argument here is basically "don't change anything". It's not
> exactly a series then, is it? If you want trailing edge, 2.0 is still
> being maintained, let alone 2.2. Those have a great excuse for not
> accepting anything new beyond a really obvious bugfix. 2.4 does not,
> because 2.6 isn't out yet. Backporting of somethings from 2.5 to 2.4
> will occur until then, and O(1) is an obvious eventual candidate.

it might be a candidate for inclusion once it has _proven_ stability and robustness (in terms of tester and developer exposion), on the same order of magnitude as the 2.4 kernel - but that needs time and exposure in trees like the -ac tree and vendor trees. It might not happen at all, during the lifetime of 2.4.

Note that the O(1) scheduler isnt a security or stability fix, neither is it a driver backport.
It isnt a feature backport that enables hardware that couldnt be used in 2.4 before. The VM was a special case because most people agreed that it truly sucked, and even though people keep disagreeing about that decision, the VM is in a pretty good shape now - and we still have good correlation between the VM in 2.5, and the VM in 2.4. The 2.4 scheduler on the other hand doesnt suck for 99% of the people, so our hands are not forced in any way - we have the choice of a 'proven-rock-solid good scheduler' vs. an 'even better, but still young scheduler'.

if say 90% of Linux users on the planet adopt the O(1) scheduler, and in a year or two there wont be a bigger distro (including Debian of course) without the O(1) scheduler in it [which, admittedly, is happening already], then it can and should perhaps be merged into 2.4. But right now i think that the majority of 2.4 users are running the stock 2.4 scheduler.

> So the fact that it's in Alan Cox's kernel (meaning Red Hat is shipping
> it in 2.4.18-5.55, meaning that if more people aren't actually USING it
> yet than marcelo's 2.4, they will be soon), and andrea's kernel (meaning
> new VM development is being done with it in mind)... It may not be
> "sufficiently tested" yet but it's GETTING a lot of testing. You use
> anything EXCEPT a stock vanilla 2.4, you're probably getting O(1) at
> this point.

things like migration to a new kernel happen on a slighly slower scale than the 6 months this patch has existed. I'd say in 1 year what you say might be true. 70% of the Linux users are not running the 'very latest' release.

also note that the O(1) scheduler patch in the Red Hat kernel rpm was a stability fork done months ago, with stability fixes backported into it. The 2.4 O(1) patches being distributed now are more like direct backports of the 2.5 scheduler - this way we can get testing and feedback even from those people who do not want to (or cannot) run a 2.5 kernel due to the massive IO changes being underway.
i do not say that the O(1) scheduler has bugs (if i knew about any i'd have fixed it already :), i am simply saying that to be able to say to Marcelo "it does not have bugs and does not introduce problems" it needs more exposure. [ And if the author of a given piece of code says things like this then it usually does not get merged ;-) ]

> not there on the scheduler yet, but "should not happen" without a
> qualifier means "never"...

we agree here.

> The real question is, how much MORE conservative than the distros should
> the mainline kernels be?

There's a natural 'feature race' between distros, so the distros can act as an additional (and pretty powerful) testing tool for various kernel features - and for which the distros are willing to spend resources and take risks as well. In fact they also act as a 'user demand' filter, for kernel features as well. And if all distros pick up a given feature, and it's been in for more than 6 months, (instead of 'more than 6 months since first patch') then Marcelo will have a much easier decision :-)

Ingo

From: Bill Davidsen
Subject: Re: [OKS] O(1) scheduler in 2.4
Date: Wed, 3 Jul 2002 23:36:07 -0400 (EDT)

> it might be a candidate for inclusion once it has _proven_ stability and
> robustness (in terms of tester and developer exposion), on the same order
> of magnitude as the 2.4 kernel - but that needs time and exposure in trees
> like the -ac tree and vendor trees. It might not happen at all, during the
> lifetime of 2.4.

It has already proven to be stable and robust in the sense that it isn't worse than the stock scheduler on typical loads and is vastly better on some.

>
> Note that the O(1) scheduler isnt a security or stability fix, neither is
> it a driver backport. It isnt a feature backport that enables hardware
> that couldnt be used in 2.4 before.
> The VM was a special case because most
> people agreed that it truly sucked, and even though people keep
> disagreeing about that decision, the VM is in a pretty good shape now -
> and we still have good correlation between the VM in 2.5, and the VM in
> 2.4. The 2.4 scheduler on the other hand doesnt suck for 99% of the
> people, so our hands are not forced in any way - we have the choice of a
> 'proven-rock-solid good scheduler' vs. an 'even better, but still young
> scheduler'.

Here I disagree. Sure behaves like a stability fix to me. On a system with a mix of interractive and cpu-bound processes, including processes with hundreds of threads, you just can't get reasonable performance balancing with nice() because it is totally impractical to keep tuning a thread which changes from hog to disk io to socket waits with a human in the loop. The new scheduler notices this stuff and makes it work, I don't even know for sure (as in tried it) if you can have different nice on threads of the same process.

This is not some neat feature to buy a few percent better this or that, this is roughly 50% more users on the server before it falls over, and no total bogs when many threads change to hog mode at once.

You will not hear me saying this about preempt, or low-latency, and I bet that after I try lock-break this weekend I won't feel that I have to have that either. The O(1) scheduler is self defense against badly behaved processes, and the reason it should go in mainline is so it won't depend on someone finding the time to backport the fun stuff from 2.5 as a patch every time.

--
bill davidsen [email blocked]
CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.
From: Ingo Molnar
Subject: Re: [OKS] O(1) scheduler in 2.4
Date: Thu, 4 Jul 2002 08:56:01 +0200 (CEST)

On Wed, 3 Jul 2002, Bill Davidsen wrote:
> It has already proven to be stable and robust in the sense that it isn't
> worse than the stock scheduler on typical loads and is vastly better on
> some.

this is your experience, and i'm happy about that. Whether it's the same experience for 90% of Linux users, time will tell.

> Here I disagree. Sure behaves like a stability fix to me. On a system
> with a mix of interractive and cpu-bound processes, including processes
> with hundreds of threads, you just can't get reasonable performance
> balancing with nice() because it is totally impractical to keep tuning a
> thread which changes from hog to disk io to socket waits with a human in
> the loop. The new scheduler notices this stuff and makes it work, I
> don't even know for sure (as in tried it) if you can have different nice
> on threads of the same process.

(yes, it's possible to nice() individual threads.)

> This is not some neat feature to buy a few percent better this or that,
> this is roughly 50% more users on the server before it falls over, and
> no total bogs when many threads change to hog mode at once.

are these hard numbers? I havent seen much hard data yet from real-life servers using the O(1) scheduler. There was lots of feedback from desktop-class systems that behave better, but servers used to be pretty good with the previous scheduler as well.

> You will not hear me saying this about preempt, or low-latency, and I
> bet that after I try lock-break this weekend I won't fell that I have to
> have that either. The O(1) scheduler is self defense against badly
> behaved processes, and the reason it should go in mainline is so it
> won't depend on someone finding the time to backport the fun stuff from
> 2.5 as a patch every time.

well, the O(1) scheduler indeed tries to put up as much defense against 'badly behaved' processes as possible.
In fact you should try to start up your admin shells via nice -20, that gives much more priority than it used to under the previous scheduler - it's very close to the RT priorities, but without the risks. This works in the other direction as well: nice +19 has a much stronger meaning (in terms of preemption and timeslice distribution) than it used to.

Ingo

From: J Sloan
Subject: Re: [OKS] O(1) scheduler in 2.4
Date: Thu, 04 Jul 2002 00:36:30 -0700

Ingo, it's apparent you are refraining from pushing this O(1) scheduler - that's admirable, but don't swing too far in the other direction. The fact is, it's working well in 2.5, it's working well in the 2.4-ac tree, it's working well in the 2.4-aa tree, and Red Hat has been shipping it.

It will soon be the case that most Linux users are using O(1) - thus any poor clown who downloads the standard src from kernel.org has a large task ahead of him if he wants similar functionality to the majority of linux users.

This divergence may not be a good thing... ;-)

Joe

From: Robert Love
Subject: [PATCH] O(1) scheduler for 2.4.19-rc1
Date: 02 Jul 2002 10:11:35 -0700

Available at
ftp://ftp.kernel.org/pub/linux/kernel/people/rml/sched/ingo-O1/sched-O1-rml-2.4.19-rc1-1.patch
and mirrors.

Aside from the resync to 2.4.19-rc1, the following changes are new since the last release (most all pulled from 2.5):

- reintroduce sync wake ups
- whitespace cleanup, trivial cleanups
- remove frozen lock and introduce new arch-specific
  switch_mm() logic
- new rq_lock and rq_unlock methods
- wake_up optimization
- nr_uninterruptible optimization for count_active_tasks
- merge the task CPU affinity system calls
- sched_yield bugfix
- minor fixes

Compiles on x86 UP and SMP.
Since Ingo recently posted 2.4-ac resyncs, I will refrain.
As I am the one doing these 2.4 patches, I will invariably be asked
whether I intend for the O(1) scheduler to be merged into 2.4. The
answer is a strong NO.

Enjoy,
Robert Love
From: venom
Subject: Re: [PATCH] O(1) scheduler for 2.4.19-rc1
Date: Wed, 3 Jul 2002 00:10:21 +0200 (CEST)

On 2 Jul 2002, Robert Love wrote:
> Since Ingo recently posted 2.4-ac resyncs, I will refrain.
>
> As I am the one doing these 2.4 patches, I will invariably be asked
> whether I intend for the O(1) scheduler to be merged into 2.4. The
> answer is a strong NO.

Of course, I think you know that you will also asked WHY?
Also if I can immagine your reasons, as similar discussions have been
done for preemption patch and so on, and as I said at the times, I Agree.

2.5 is the place for this new and cool stuff.
Luigi
From: Robert Love
Subject: Re: [PATCH] O(1) scheduler for 2.4.19-rc1
Date: 02 Jul 2002 15:18:00 -0700

On Tue, 2002-07-02 at 15:10:
> Of course, I think you know that you will also asked WHY?

Because I do not think 2.4 should be a breeding ground for every new
feature that wets someone's appetite. It should be stable and trusted
before anything else. We also have to worry about architecture
support. Let the scheduler be 2.5's thing.

> Also if I can immagine your reasons, as similar discussions have been
> done for preemption patch and so on, and as I said at the times, I Agree.

I do not think preemption should go in 2.4, either. It too is a 2.5
thing.

> 2.5 is the place for this new and cool stuff.

Agreed.
Robert Love