Re: [Rt2400-devel] 2.6.25-rc2 regression in rt61pci wireless driver

From: Chris Clayton
Date: Saturday, February 16, 2008 - 5:06 am

Hi,

Firstly, please cc me on any replies - I'm not subscribed.

In 2.6.25 kernels, my wireless LAN dies after even the smallest amount
of network activity. The following screen cut shows what I typically
see:

[chris:~]$ uname -a
Linux laptop 2.6.25-rc2 #10 PREEMPT Sat Feb 16 09:53:04 UTC 2008 i686
GNU/Linux
[chris:~]$ ping 192.168.1.1
PING 192.168.1.1 (192.168.1.1) from 192.168.1.30 : 56(84) bytes of data.
64 bytes from 192.168.1.1: icmp_seq=0 ttl=255 time=9.837 msec
64 bytes from 192.168.1.1: icmp_seq=1 ttl=255 time=3.148 msec
64 bytes from 192.168.1.1: icmp_seq=2 ttl=255 time=2.205 msec
^C
--- 192.168.1.1 ping statistics ---
9 packets transmitted, 3 packets received, 66% packet loss
round-trip min/avg/max/mdev = 2.205/5.063/9.837/3.397 ms
[chris:~]$ dmesg | tail
NET: Unregistered protocol family 23
parport_pc 00:0f: disabled
ACPI: PCI interrupt for device 0000:02:00.0 disabled
ACPI: PCI Interrupt 0000:01:00.0[A] -> Link [LNKA] -> GSI 11 (level, low) ->
IRQ 11
[drm] Initialized radeon 1.28.0 20060524 on minor 0
agpgart: Found an AGP 2.0 compliant device at 0000:00:00.0.
agpgart: Putting AGP V2 device at 0000:00:00.0 into 2x mode
agpgart: Putting AGP V2 device at 0000:01:00.0 into 2x mode
[drm] Setting GART location based on old memory map
[drm] writeback test succeeded in 1 usecs
[chris:~]$

As you can see, after a few packets, the ping application hangs and
after that point, all network accesses fail. There are no error
messages written to the logs when the network dies. I can restart the
network by simply unloading and reloading the driver. This hang does
not occur with a different wireless card that uses the rtl8180 driver.

My config and the output from dmesg (after reloading the driver) are
attached. I have Ralink debugfs enabled so can provide any additional
diagnostics that may be helpful from that source.

(I should perhaps add that I am sure that this is not the same problem
I reported through bugzilla (bug 9860) a few weeks ago, which it
seems ...
From: Ivo van Doorn
Date: Monday, February 18, 2008 - 11:11 am

How complete is this failure? Just TX or also RX?

Could you use the tools found here:
http://www-user.rhrk.uni-kl.de/~nissler/rt2x00/index.html

and capture all TX/RX frames going through the hardware?
Note that after the failure, this dumping facility should still report

I have a series of tests I would like to request from you,
you mentioned you already enabled debugfs, and that is just what we need. ;)
Please use attached script to create dumps of the hardware register contents.

There are specific moments that should be dumped:
- kernel 2.6.24 (last known working version for you).
- kernel 2.6.25-rc2 (after ifup, before TX dies)

Above traces should be enough, but to determine approximately where rt2x00
broke down I need a few test results at specific points.
Could you test the kernel with the following versions:

rt2x00 2.0.11	2d68de3efa62655d551092f5c787505735d561ad
rt2x00 2.0.12	a3c7aa58df7df80aa05f166fe3e42482247164cf
rt2x00 2.0.13	5a6012e105ae1664cd2841c33bf59fbdd8d4dbcc

Checking those out is simply a matter of:
git branch 2.0.11 2d68de3efa62655d551092f5c787505735d561ad
git checkout 2.0.11

No further bisecting is needed, but with above tests I can at least
narrow it down to find the cause of this issue.

Thanks.

Ivo
From: Ivo van Doorn
Date: Monday, February 18, 2008 - 11:16 am

The debugfs register files were moved into a separate folder somewhere between
2.6.24 and 2.6.25. This means you might have to edit the file slightly to make it
point to the correct location of the chipset file.

Ivo
--

From: Chris Clayton
Date: Monday, February 18, 2008 - 3:51 pm

Hi Ivo,


It will be tomorrow before I can provide this because I'm struggling to get
wireshark to build against the old (2.4.x) kernel headers that match glibc on
my laptop. I'll build it on my desktop, which has more recent headers, and




-- 
Beauty is in the eye of the beerholder.
From: Ivo van Doorn
Date: Tuesday, February 19, 2008 - 2:26 am

Thanks. I think I found something, please test below patch:

---
diff --git a/drivers/net/wireless/rt2x00/rt2x00dev.c b/drivers/net/wireless/rt2x00/rt2x00dev.c
index 015738a..8df1991 100644
--- a/drivers/net/wireless/rt2x00/rt2x00dev.c
+++ b/drivers/net/wireless/rt2x00/rt2x00dev.c
@@ -249,10 +249,10 @@ static void rt2x00lib_evaluate_antenna(struct rt2x00_dev *rt2x00dev)
 	rt2x00dev->link.ant.flags &= ~ANTENNA_TX_DIVERSITY;
 
 	if (rt2x00dev->hw->conf.antenna_sel_rx == 0 &&
-	    rt2x00dev->default_ant.rx != ANTENNA_SW_DIVERSITY)
+	    rt2x00dev->default_ant.rx == ANTENNA_SW_DIVERSITY)
 		rt2x00dev->link.ant.flags |= ANTENNA_RX_DIVERSITY;
 	if (rt2x00dev->hw->conf.antenna_sel_tx == 0 &&
-	    rt2x00dev->default_ant.tx != ANTENNA_SW_DIVERSITY)
+	    rt2x00dev->default_ant.tx == ANTENNA_SW_DIVERSITY)
 		rt2x00dev->link.ant.flags |= ANTENNA_TX_DIVERSITY;
 
 	if (!(rt2x00dev->link.ant.flags & ANTENNA_RX_DIVERSITY) &&



--

From: Chris Clayton
Date: Tuesday, February 19, 2008 - 12:00 pm

Hi,


I've tried the patch but, unfortunately, my wireless LAN still dies after a few pings.

The frame dump diagnostics you asked for are attached. This is a fresh dump taken
tonight running the driver with your patch applied.


Chris
-- 
Beauty is in the eye of the beerholder.
From: Ivo van Doorn
Date: Tuesday, February 19, 2008 - 12:46 pm

Hi,


Could you use below patch instead, and make a new dump of the register?
I'm still convinced the breakage occurs in the antenna diversity (or rather, I believe

Thanks, I think I miss some information in that dump,
but that is okay for now.

Ivo

---

diff --git a/drivers/net/wireless/rt2x00/rt2x00dev.c b/drivers/net/wireless/rt2x00/rt2x00dev.c
index 015738a..65a512f 100644
--- a/drivers/net/wireless/rt2x00/rt2x00dev.c
+++ b/drivers/net/wireless/rt2x00/rt2x00dev.c
@@ -223,7 +223,7 @@ static void rt2x00lib_evaluate_antenna_eval(struct rt2x00_dev *rt2x00dev)
 	 * sample the rssi from the other antenna to make a valid
 	 * comparison between the 2 antennas.
 	 */
-	if ((rssi_curr - rssi_old) > -5 || (rssi_curr - rssi_old) < 5)
+	if (abs(rssi_curr - rssi_old) < 5)
 		return;
 
 	rt2x00dev->link.ant.flags |= ANTENNA_MODE_SAMPLE;
@@ -249,10 +249,10 @@ static void rt2x00lib_evaluate_antenna(struct rt2x00_dev *rt2x00dev)
 	rt2x00dev->link.ant.flags &= ~ANTENNA_TX_DIVERSITY;
 
 	if (rt2x00dev->hw->conf.antenna_sel_rx == 0 &&
-	    rt2x00dev->default_ant.rx != ANTENNA_SW_DIVERSITY)
+	    rt2x00dev->default_ant.rx == ANTENNA_SW_DIVERSITY)
 		rt2x00dev->link.ant.flags |= ANTENNA_RX_DIVERSITY;
 	if (rt2x00dev->hw->conf.antenna_sel_tx == 0 &&
-	    rt2x00dev->default_ant.tx != ANTENNA_SW_DIVERSITY)
+	    rt2x00dev->default_ant.tx == ANTENNA_SW_DIVERSITY)
 		rt2x00dev->link.ant.flags |= ANTENNA_TX_DIVERSITY;
 
 	if (!(rt2x00dev->link.ant.flags & ANTENNA_RX_DIVERSITY) &&

--
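
For readers following the first hunk of the patch above: the replaced condition joins two one-sided comparisons with ||, so it holds for every integer difference, and the function always returns before antenna sampling can start. The sketch below is a standalone illustration, not driver code. The function names and the RSSI_DELTA_THRESHOLD macro are hypothetical; only the literal 5 and the two conditions are taken from the patch.

```c
#include <stdlib.h>

/* Hypothetical name; the patch itself uses the literal 5. */
#define RSSI_DELTA_THRESHOLD 5

/* Condition before the patch: every integer is either greater than -5
 * or less than 5, so this returns 1 for any pair of RSSI values. */
int rssi_unchanged_old(int rssi_curr, int rssi_old)
{
	int delta = rssi_curr - rssi_old;

	return delta > -RSSI_DELTA_THRESHOLD || delta < RSSI_DELTA_THRESHOLD;
}

/* Condition after the patch: returns 1 only when the two values differ
 * by less than the threshold in either direction. */
int rssi_unchanged_new(int rssi_curr, int rssi_old)
{
	return abs(rssi_curr - rssi_old) < RSSI_DELTA_THRESHOLD;
}
```

With rssi_curr = -40 and rssi_old = -80 the old check still reports "unchanged", while the abs() form does not.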

From: Chris Clayton
Date: Tuesday, February 19, 2008 - 1:44 pm

Hi,


Sorry, I've applied that patch and the LAN still dies after a few pings. BTW,
this and the earlier patch both apply without error, but give warnings of 70
line offsets. Were you expecting them to apply completely cleanly? I'm just
wondering if there might be some code that you are expecting to be running (or
not running) that is (or is not) present in the driver at 2.6.25-rc2.

The register dumps before and after are attached.

Thanks,




-- 
Beauty is in the eye of the beerholder.
From: Ivo van Doorn
Date: Tuesday, February 19, 2008 - 2:03 pm

Well to be honest I based the patch on rt2x00.git and not 2.6.25-rc2.
I know the patch would apply safely because the functions that were changed
in that patch haven't changed between them. But some other functions were

Thanks. I hope to have a new patch ready soon.

Ivo

--

From: Chris Vine
Date: Tuesday, February 19, 2008 - 4:04 pm

rt2x00 2.0.14 is broken with my rt73 stick in the vanilla 2.6.25-rc2
kernel (not wireless-2.6/rt2x00 git).  The modules load when I plug the
stick in but I then get a complete kernel lock up with two flashing
leds.  Nothing is recorded to system logs.  The last logged messages are
that usbcore has registered new interface driver rt73usb, and that the
rate control algorithm has been selected on phy0.  This happens whether
the simple or pid mac80211 rate control algorithms have been chosen.

This is a shame because 2.0.14 was working really well for me until the
mac80211 changes 2 or 3 weeks ago broke it.  (Shortly followed by the
release of 2.1.*).

Chris



--

From: Dan Williams
Date: Wednesday, February 20, 2008 - 9:05 am

Switch to a VT with Ctrl+Alt+F1, then plug the stick in, and take a
picture of the panic if one shows up.  _Something_ should show up on the
VT.

Dan


--

From: Chris Vine
Date: Wednesday, February 20, 2008 - 1:27 pm

I did that yesterday and it just reported a kernel panic on the terminal
with the message:

  Kernel panic - not syncing: Aiee, killing interrupt handler!

There is a complete lock up.  Even the two leds don't send a dump in
morse code (if that is still a feature of the 2.6 kernels).  They just
flash together at 1 second intervals.

However, I do not have debugging enabled on 2.6.25-rc2 (I was just
interested to see how it worked).  If it is thought to be useful I can
recompile the kernel with debugging enabled, but this should be
reproducible by anyone with a rt73 stick.

By way of a further data point, I can scan OK using 2.6.25-rc2 and it
will report all the available access points in my area.  But as soon as
association is attempted, it blows up.

Chris


--

From: Ivo van Doorn
Date: Wednesday, February 20, 2008 - 1:50 pm

I have an idea, could you try below patch?
Note that while applying it will mention something about a line offset, but that can be ignored.

This could perhaps also fix the TX/RX issue mentioned earlier in the thread, but I am not
quite sure about that.

---
diff --git a/drivers/net/wireless/rt2x00/rt2400pci.c b/drivers/net/wireless/rt2x00/rt2400pci.c
index b63bc66..460ef2f 100644
--- a/drivers/net/wireless/rt2x00/rt2400pci.c
+++ b/drivers/net/wireless/rt2x00/rt2400pci.c
@@ -953,8 +953,12 @@ static int rt2400pci_set_device_state(struct rt2x00_dev *rt2x00dev,
 		rt2400pci_disable_radio(rt2x00dev);
 		break;
 	case STATE_RADIO_RX_ON:
+	case STATE_RADIO_RX_ON_LINK:
+		rt2400pci_toggle_rx(rt2x00dev, STATE_RADIO_RX_ON);
+		break;
 	case STATE_RADIO_RX_OFF:
-		rt2400pci_toggle_rx(rt2x00dev, state);
+	case STATE_RADIO_RX_OFF_LINK:
+		rt2400pci_toggle_rx(rt2x00dev, STATE_RADIO_RX_OFF);
 		break;
 	case STATE_DEEP_SLEEP:
 	case STATE_SLEEP:
diff --git a/drivers/net/wireless/rt2x00/rt2500pci.c b/drivers/net/wireless/rt2x00/rt2500pci.c
index add8aff..ffcd996 100644
--- a/drivers/net/wireless/rt2x00/rt2500pci.c
+++ b/drivers/net/wireless/rt2x00/rt2500pci.c
@@ -1106,8 +1106,12 @@ static int rt2500pci_set_device_state(struct rt2x00_dev *rt2x00dev,
 		rt2500pci_disable_radio(rt2x00dev);
 		break;
 	case STATE_RADIO_RX_ON:
+	case STATE_RADIO_RX_ON_LINK:
+		rt2500pci_toggle_rx(rt2x00dev, STATE_RADIO_RX_ON);
+		break;
 	case STATE_RADIO_RX_OFF:
-		rt2500pci_toggle_rx(rt2x00dev, state);
+	case STATE_RADIO_RX_OFF_LINK:
+		rt2500pci_toggle_rx(rt2x00dev, STATE_RADIO_RX_OFF);
 		break;
 	case STATE_DEEP_SLEEP:
 	case STATE_SLEEP:
diff --git a/drivers/net/wireless/rt2x00/rt2500usb.c b/drivers/net/wireless/rt2x00/rt2500usb.c
index d9643c5..9f59db9 100644
--- a/drivers/net/wireless/rt2x00/rt2500usb.c
+++ b/drivers/net/wireless/rt2x00/rt2500usb.c
@@ -996,8 +996,12 @@ static int rt2500usb_set_device_state(struct rt2x00_dev *rt2x00dev,
 		rt2500usb_disable_radio(rt2x00dev);
 		break;
 ...
From: Chris Vine
Date: Wednesday, February 20, 2008 - 2:16 pm

On Wed, 2008-02-20 at 21:50 +0100, Ivo van Doorn wrote:

The patch applied OK (with some offsets as you say) but it doesn't help.
The kernel panic still occurs when association is attempted.

Chris


--

From: Chris Vine
Date: Thursday, February 21, 2008 - 2:07 pm

Here's some further information.

I have a fully functioning version of rt2x00-2.0.14 and mac80211 from
wireless-2.6/compat-wireless-2.6 of mid January which works fine on
kernel 2.6.24.  On doing a comparison with the rt2x00 in vanilla kernel
2.6.25-rc2, there are no material differences.  (There was a slight
change in the declaration of a variable in rt2x00usb.c but it is
immaterial.)

I compiled up the working mid-January version of rt2x00 and mac80211
under kernel 2.6.25-rc2 and I get exactly the same result as I reported
earlier, namely I get a kernel panic as soon as I try to associate.  It
looks therefore as if something has changed within the remainder of the
kernel which has caused rt2x00 (and possibly mac80211?) to break.

This probably explains the problem another user reported with rt61.

Chris


--

From: Ivo van Doorn
Date: Thursday, February 21, 2008 - 2:51 pm

Perhaps this is something similar to:
http://bugzilla.kernel.org/show_bug.cgi?id=10058
There, a reference is made to the following patch:
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.25-rc2/2.6.25-rc2-mm...

Does applying that help?

Ivo
--

From: Chris Vine
Date: Thursday, February 21, 2008 - 3:46 pm

Yes, well done.

I have spent 20 minutes testing it and it seems to work fine (at least
as well as 2.0.14 does under kernel 2.6.24).  The rate control algorithm
seems to work better as well, but that is probably a mac80211 thing.

Chris


--

From: Ivo van Doorn
Date: Thursday, February 21, 2008 - 3:51 pm

Excellent, I am currently updating rt2x00.git from wireless-testing to get
the above mentioned patch into the repository.

Ivo

--

From: Chris Clayton
Date: Thursday, February 21, 2008 - 4:04 pm

I'm afraid not, Ivo. The test I ran last night was against 2.6.25-rc2-git4 and
that already has this patch applied. Furthermore, I have another card that uses
the rtl8180 driver and that works reliably. I, therefore, suspect that my problem
lies within the rt61pci driver or the rt2x00 infrastructure.




-- 
Beauty is in the eye of the beerholder.
--

From: Chris Vine
Date: Thursday, February 21, 2008 - 4:20 pm

Does the same happen with 2.0.14 under kernel 2.6.24?

Chris



--

From: Chris Clayton
Date: Friday, February 22, 2008 - 12:39 am

Unfortunately, a 2.6.24.2 tree with the drivers/net/wireless/rt2x00 directory replaced with that from 2.6.25-rc2-git4 doesn't build:

In file included from drivers/net/wireless/rt2x00/rt2x00dev.c:29:
drivers/net/wireless/rt2x00/rt2x00.h:942: warning: `struct ieee80211_bss_conf' declared inside parameter list
drivers/net/wireless/rt2x00/rt2x00.h:942: warning: its scope is only this definition or declaration, which is probably not what you want

[...]

drivers/net/wireless/rt2x00/rt2x00dev.c: In function `rt2x00lib_configuration_scheduled':
drivers/net/wireless/rt2x00/rt2x00dev.c:484: error: storage size of `bss_conf' isn't known
drivers/net/wireless/rt2x00/rt2x00dev.c:494: error: `BSS_CHANGED_ERP_PREAMBLE' undeclared (first use in this function)
drivers/net/wireless/rt2x00/rt2x00dev.c:494: error: (Each undeclared identifier is reported only once
drivers/net/wireless/rt2x00/rt2x00dev.c:494: error: for each function it appears in.)
drivers/net/wireless/rt2x00/rt2x00dev.c:484: warning: unused variable `bss_conf'
drivers/net/wireless/rt2x00/rt2x00dev.c: In function `rt2x00lib_beacondone_scheduled':

-- 
Beauty is in the eye of the beerholder.
--

From: Ivo van Doorn
Date: Friday, February 22, 2008 - 12:11 pm

You need the mac80211 compat module from Intel. That allows new mac80211 versions to run
on older kernels. When you use that you need to grab the rt2x00 cvs tarball from:
http://rt2x00.serialmonkey.com/wiki/index.php/Downloads


Ivo
--

From: Chris Vine
Date: Friday, February 22, 2008 - 1:33 pm

You have to have a version of mac80211 which is current with the version
of rt2x00 2.0.14.  If you don't want to go back into wireless-2.6 git to
do that I can send you my known working copy (for rt73 and I hope for
rt61) of compat-wireless-2.6 which will compile under 2.6.24.  However
it is just over 1MB in size so I won't send it to you unless you would
like me to do that instead of you pulling git from wireless-2.6 for mid
January (actually the most recent working version would be immediately
before the patch which raised the version of rt2x00 to 2.1.0).

Chris


--

From: Chris Clayton
Date: Wednesday, February 20, 2008 - 3:13 pm

Hi Ivo,


Sorry, but again a few pings and the network fails. I've attached the before and
after register dumps. This is with your patch applied against 2.6.25-rc2-git4.

Chris

-- 
Beauty is in the eye of the beerholder.
From: Chris Clayton
Date: Friday, February 22, 2008 - 8:46 am

Hi Ivo,



OK, we seem to be struggling a little, so I've built and installed git
and cloned Linus' 2.6 tree. My wireless network dies after a few pings

If you need me to bisect, just shout. Please be patient though, I'm
exploring new territory here :-)

Thanks



-- 
Beauty is in the eye of the beerholder.
--

From: Ivo van Doorn
Date: Friday, February 22, 2008 - 12:47 pm

I don't think bisecting this will help a lot, the rt2x00 2.0.11 release
introduced software diversity. And that is already something I suspect
of being broken.

Unfortunately software diversity was a bugfix and fix in one,
the previous setup was broken for some hardware since the
lack of software diversity caused problems.

Could you check if below patch helps in any way?

Ivo

---

diff --git a/drivers/net/wireless/rt2x00/rt2x00config.c b/drivers/net/wireless/rt2x00/rt2x00config.c
index a1d8e33..6995912 100644
--- a/drivers/net/wireless/rt2x00/rt2x00config.c
+++ b/drivers/net/wireless/rt2x00/rt2x00config.c
@@ -122,6 +122,10 @@ void rt2x00lib_config_antenna(struct rt2x00_dev *rt2x00dev,
 	libconf.ant.rx = rx;
 	libconf.ant.tx = tx;
 
+	if (rx == rt2x00dev->link.ant.active.rx &&
+	    tx == rt2x00dev->link.ant.active.tx)
+		return;
+
 	/*
 	 * Antenna setup changes require the RX to be disabled,
 	 * else the changes will be ignored by the device.
diff --git a/drivers/net/wireless/rt2x00/rt2x00dev.c b/drivers/net/wireless/rt2x00/rt2x00dev.c
index 65a512f..4325c08 100644
--- a/drivers/net/wireless/rt2x00/rt2x00dev.c
+++ b/drivers/net/wireless/rt2x00/rt2x00dev.c
@@ -191,16 +191,16 @@ static void rt2x00lib_evaluate_antenna_sample(struct rt2x00_dev *rt2x00dev)
 		return;
 
 	if (rt2x00dev->link.ant.flags & ANTENNA_RX_DIVERSITY) {
-		if (sample_a > sample_b && rx == ANTENNA_B)
+		if (sample_a > sample_b)
 			rx = ANTENNA_A;
-		else if (rx == ANTENNA_A)
+		else
 			rx = ANTENNA_B;
 	}
 
 	if (rt2x00dev->link.ant.flags & ANTENNA_TX_DIVERSITY) {
-		if (sample_a > sample_b && tx == ANTENNA_B)
+		if (sample_a > sample_b)
 			tx = ANTENNA_A;
-		else if (tx == ANTENNA_A)
+		else
 			tx = ANTENNA_B;
 	}
 
@@ -257,7 +257,7 @@ static void rt2x00lib_evaluate_antenna(struct rt2x00_dev *rt2x00dev)
 
 	if (!(rt2x00dev->link.ant.flags & ANTENNA_RX_DIVERSITY) &&
 	    !(rt2x00dev->link.ant.flags & ANTENNA_TX_DIVERSITY)) {
-		rt2x00dev->link.ant.flags &= ...
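
The sample-selection hunk in the patch above can be read as follows: in the old code, when antenna A had the better sample and A was already in use, the first test failed and the else-if branch switched to B anyway. A minimal standalone sketch, assuming a larger sample means a better signal (as the patch's comparison implies). The enum and the function names are stand-ins for illustration, not the driver's definitions.

```c
/* Hypothetical stand-in for the driver's antenna constants. */
enum antenna { ANTENNA_A, ANTENNA_B };

/* Selection as written before the patch: the result depends on which
 * antenna is currently in use, not only on the samples. */
enum antenna pick_antenna_old(int sample_a, int sample_b, enum antenna cur)
{
	if (sample_a > sample_b && cur == ANTENNA_B)
		cur = ANTENNA_A;
	else if (cur == ANTENNA_A)
		cur = ANTENNA_B;
	return cur;
}

/* Selection after the patch: choose purely on the two samples. */
enum antenna pick_antenna_new(int sample_a, int sample_b)
{
	return sample_a > sample_b ? ANTENNA_A : ANTENNA_B;
}
```

For sample_a = 10, sample_b = 5 and antenna A in use, the old form moves to B, while the new form stays on A.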
From: Chris Clayton
Date: Monday, February 25, 2008 - 2:04 pm

Hi,

Firstly apologies for trimming linux-kernel and linux-wireless from my reply 
to Ivo yesterday. Basically, I replied saying that the patch below didn't fix 
the problem. But please do read on...

I've bisected anyway and although the results are not absolutely conclusive,
as I neared the end of the process, I was amongst a bunch of mac80211
patches. This set me on a path that resulted in me discovering that with the
rt61pci driver, I can freeze my wireless network connection almost at will if
I set mac80211's ieee80211_default_rc_algo parameter to 'pid'. If the
parameter is set to 'simple', the network seems to be reliable. I've just let
the ping application run on and ping another box on my network almost 1500
times whilst repeatedly transferring a kernel source tarball by ftp from
another box and the network connection was maintained. That's with the
parameter set to 'simple'; if I set it to 'pid' the connection rarely
survives more than 40 pings even without the ftp activity.

If I replace my wireless card with one that uses the rtl8180 driver, the 
network connection seems to be reliable regardless of how I set the 
parameter, although I admit that I have not tested this extensively yet. I'll
do that now and report later.

Hope this helps.




-- 
Beauty is in the eye of the beerholder.
--

From: Ivo Van Doorn
Date: Monday, February 25, 2008 - 3:09 pm

I'm about to send 4 rt2x00 patches to this (linux-wireless) list.
You have already tested most of them individually, but several people
reported success after applying them.

Hopefully it will be working for you as well. :)

Ivo
--

From: Chris Clayton
Date: Tuesday, February 26, 2008 - 12:11 pm

I've rerun my tests with the rtl8180 driver and found the network to be 
reliable with the mac80211 module's ieee80211_default_rc_algo parameter set

Sorry, but that's not the case. I find the same results as without the 
patches. With the parameter set to 'pid', the network connection fails very 
quickly, but with it set to 'simple' I can ping and ftp files to and from my 
laptop as much as I like and the connection stays up. In fact, if anything 
the patches seem to have made the network even more fragile, in that it fails 
almost instantly once I start some network activity ( < 10 pings).

I'm sure this is not the hardware - it works perfectly with Windows XP, with 
2.6.23.14 plus the out-of-tree rt61 driver from serialmonkey, with the 
in-tree driver from 2.6.24.x and with 2.6.25-rc3 with mac80211's
ieee80211_default_rc_algo parameter set to 'simple'.

Like I say above, sorry!


-- 
Beauty is in the eye of the beerholder.
--

From: John W. Linville
Date: Tuesday, February 26, 2008 - 12:48 pm

At last!  Vindication for insisting that we keep 'simple' around!
Bwahahaha! :-)

So, am I to understand that 'pid' works fine for you with rtl8180?
If so, then I wonder if Stefano and Ivo can help us figure out
what kind of problem is sensitive to both driver _and_ rate control
algorithm?

Thanks,

John
-- 
John W. Linville
linville@tuxdriver.com
--

From: Ivo Van Doorn
Date: Tuesday, February 26, 2008 - 1:30 pm

rt2x00 is known to be less sensitive than the legacy drivers: scanning
produces fewer and more inconsistent results (not all APs are reported,
even when an AP has a high RSSI), and the reported RSSI is often
much lower than expected given the distance to the AP.
I have compared many register dumps, but have never managed to
find a real register setting that might cause this. So what might be
the problem is that rt2x00 is not reporting the RSSI correctly to mac80211.

With the big difference between how mac80211 handles TX rates and
how the legacy drivers handle them, it is hard to make a comparison
where exactly things are going wrong. But in the end, I think it all
comes down to rt2x00 reporting invalid RSSI values to mac80211,
and/or the rate control mechanism being too dependent on some
statistics which are not provided by the driver.

I have to admit that I haven't looked into the 'pid' algorithm closely,
but could it be that some fields in the tx status report upon txdone
are being treated as "very important" while the driver doesn't report it
(For example ack signal strength)?

Other than that I have to say that rt2x00 never has reached a particular
state where link quality issues can be traced back to mac80211 or
the rate control mechanism. It usually was caused by a bug in the driver
itself. (rt2x00 cannot be considered stable yet for a very good reason. ;) )

Ivo
--

From: Stefano Brivio
Date: Tuesday, February 26, 2008 - 2:44 pm

On Tue, 26 Feb 2008 21:30:38 +0100


The only important things drivers should report back to mac80211 are ACKed
frames. In rc80211-pid (and it's just the same in rc80211-simple) the only
inputs from mac80211 are successfully (re)transmitted frames and failed
frames.


-- 
Ciao
Stefano
--

From: Chris Clayton
Date: Tuesday, February 26, 2008 - 2:13 pm

Yes John, using the rtl8180 driver I get reliable network performance
with either 'pid' or 'simple'. With the rt61pci driver, I find that
'simple' provides a reliable network, but 'pid' simply does not work.


-- 
Beauty is in the eye of the beerholder.
--

From: Stefano Brivio
Date: Tuesday, February 26, 2008 - 2:38 pm

On Tue, 26 Feb 2008 21:13:48 +0000

Please, could you mount debugfs and provide me with a dump of this file:
/debug/ieee80211/phy*/stations/*/rc_pid_events

Thank you.


--
Ciao
Stefano
--

From: Chris Clayton
Date: Tuesday, February 26, 2008 - 3:36 pm

Here's a dump that I started, then began pinging my gateway in another
terminal until the network failed and then stopped the dump with ^C a
few seconds later. Hope it helps.

[chris:~]$ cat /debug/ieee80211/phy0/stations/00\:60\:b3\:77\:73\:1a/rc_pid_events
3 131904 tx_status 0 0
4 131904 pf_sample 0 3584 840 0
5 134212 tx_rate 0 10
6 134212 tx_status 0 0
7 134212 pf_sample 0 3584 1183 0
8 134212 tx_rate 0 10
9 134213 tx_status 0 1
10 134462 tx_rate 0 10
11 134462 tx_status 0 0
12 134462 pf_sample 8448 -4864 427 8448
13 134713 tx_rate 0 10
14 134713 tx_status 0 0
15 134713 pf_sample 0 3584 821 -8448
16 134713 rate_change 0 10
17 134964 tx_rate 0 10
18 134964 tx_status 0 0
19 134964 pf_sample 0 3584 1167 0
20 135215 tx_rate 0 10
21 135215 tx_status 0 0
22 135215 pf_sample 0 3584 1469 0
23 135215 rate_change 1 20
24 135466 tx_rate 1 20
25 135466 tx_status 0 0
26 135466 pf_sample 0 3584 1733 0
27 135466 rate_change 2 55
28 135717 tx_rate 2 55
29 135717 tx_status 0 0
30 135717 pf_sample 0 3584 1965 0
31 135717 rate_change 129 -541505508
32 135968 tx_rate 11 540
33 136219 tx_rate 11 540
34 136470 tx_rate 11 540
35 136721 tx_rate 11 540
36 136972 tx_rate 11 540
37 137223 tx_rate 11 540
38 137474 tx_rate 11 540
39 137725 tx_rate 11 540
40 137976 tx_rate 11 540
41 138227 tx_rate 11 540
^C



-- 
Beauty is in the eye of the beerholder.
--
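
The dump above can be scanned mechanically for the suspicious entry (event 31, "rate_change 129 -541505508"). The sketch below is a hedged illustration only. It assumes each line has the form "<event id> <timestamp> <event name> <args...>" and that the two numbers after rate_change are a rate index and a rate value, which is an interpretation of the dump rather than something stated in the thread. MAX_RATE_INDEX and the function name are hypothetical; 11 is used because it is the highest index seen in the tx_rate lines.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumed upper bound, taken from the highest index in the dump above. */
#define MAX_RATE_INDEX 11

/* Returns 1 if the line is a rate_change event whose index is out of
 * range or whose rate is not positive; returns 0 for every other line,
 * including lines that do not match the assumed format. */
int rate_change_is_bogus(const char *line)
{
	unsigned long id, stamp;
	char event[32];
	int index, rate;

	assert(line != NULL);

	/* Only the first two arguments after the event name are read;
	 * pf_sample lines carry more, which are ignored here. */
	if (sscanf(line, "%lu %lu %31s %d %d",
		   &id, &stamp, event, &index, &rate) != 5)
		return 0;
	if (strcmp(event, "rate_change") != 0)
		return 0;
	return index < 0 || index > MAX_RATE_INDEX || rate <= 0;
}
```

Fed the lines of the dump one at a time, this flags only event 31; the earlier rate_change events (indexes 0, 1 and 2) pass.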

From: Stefano Brivio
Date: Wednesday, February 27, 2008 - 12:26 am

On Tue, 26 Feb 2008 22:36:19 +0000

Known and fixed. The fix isn't in 2.6.25-rc3 yet, though.

Fix:
commit 32720eae675d08990e97bffbf71a31382599cc8a
Author: Stefano Brivio <stefano.brivio@polimi.it>
Date:   Tue Jan 29 20:29:16 2008 +0100

    rc80211-pid: fix rate adjustment


--
Ciao
Stefano
--

From: John W. Linville
Date: Wednesday, February 27, 2008 - 8:51 am

And it currently isn't queued for 2.6.25 at all.

Chris, can you cherry-pick this from the wireless-2.6.26 tree and
give it a test on your 2.6.25-rc3 tree?  If it resolves a problem
then I'll queue it to Dave for 2.6.25 (which will probably provoke
a net-2.6.26 and wireless-2.6.26 rebase).

Let me know...

John
-- 
John W. Linville
linville@tuxdriver.com
--

From: John W. Linville
Date: Wednesday, February 27, 2008 - 10:25 am

That might be easier said than done.  It looks like that patch depends
on the big cfg80211 API change queued for 2.6.26.

Stefano offered to rebase that on 2.6.25.  Stefano, could you post
that as part of this thread?

Thanks,

John
-- 
John W. Linville
linville@tuxdriver.com
--

From: Chris Clayton
Date: Wednesday, February 27, 2008 - 10:45 am

Yes, that's correct. The patch doesn't apply cleanly to -rc3 and,

That would be very helpful.



-- 
Beauty is in the eye of the beerholder.
--

From: Stefano Brivio
Date: Sunday, March 2, 2008 - 3:33 am

On Wed, 27 Feb 2008 12:25:46 -0500

Sorry for the delay. This is based on 2.6.25-rc3. Please test.

---

Merge rate_control_pid_shift_adjust() to rate_control_pid_adjust_rate()
in order to make the learning algorithm aware of constraints on rates. Also
add some comments and rename variables.

This fixes a bug which prevented 802.11b/g non-AP STAs from working with
802.11b only AP STAs.

Signed-off-by: Stefano Brivio <stefano.brivio@polimi.it>
---
Index: linux-2.6.24/net/mac80211/rc80211_pid_algo.c
===================================================================
--- linux-2.6.24.orig/net/mac80211/rc80211_pid_algo.c
+++ linux-2.6.24/net/mac80211/rc80211_pid_algo.c
@@ -2,7 +2,7 @@
  * Copyright 2002-2005, Instant802 Networks, Inc.
  * Copyright 2005, Devicescape Software, Inc.
  * Copyright 2007, Mattias Nissler <mattias.nissler@gmx.de>
- * Copyright 2007, Stefano Brivio <stefano.brivio@polimi.it>
+ * Copyright 2007-2008, Stefano Brivio <stefano.brivio@polimi.it>
  *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License version 2 as
@@ -63,72 +63,66 @@
  * RC_PID_ARITH_SHIFT.
  */
 
-
-/* Shift the adjustment so that we won't switch to a lower rate if it exhibited
- * a worse failed frames behaviour and we'll choose the highest rate whose
- * failed frames behaviour is not worse than the one of the original rate
- * target. While at it, check that the adjustment is within the ranges. Then,
- * provide the new rate index. */
-static int rate_control_pid_shift_adjust(struct rc_pid_rateinfo *r,
-					 int adj, int cur, int l)
-{
-	int i, j, k, tmp;
-
-	j = r[cur].rev_index;
-	i = j + adj;
-
-	if (i < 0)
-		return r[0].index;
-	if (i >= l - 1)
-		return r[l - 1].index;
-
-	tmp = i;
-
-	if (adj < 0) {
-		for (k = j; k >= i; k--)
-			if (r[k].diff <= r[j].diff)
-				tmp = k;
-	} else {
-		for (k = i + 1; k + i < l; k++)
-			if (r[k].diff <= r[i].diff)
-				tmp = ...
From: Chris Clayton
Date: Sunday, March 2, 2008 - 8:11 am

I've tested this with the pid algorithm selected and my wireless
network connection is reliable. In a loop, I repeatedly ftp'd a kernel
source tarball from another box on my network for 40 minutes with no
failure, whilst at the same time, pinging my gateway. Without the
patch, that activity would have led to network failure in seconds. The
tests were with the patch applied to 2.6.25-rc3-git3, so that Ivo's
rt2x00 patches for 2.6.25 are also applied.




-- 
Beauty is in the eye of the beerholder.
--
