[resubmit] Add GEM to i915 DRM driver

Previous thread: none

Next thread: [PATCH 1/1] selinux: add support for installing a dummy policy (v2) by Serge E. Hallyn on Tuesday, August 26, 2008 - 3:47 pm. (2 messages)
To: <linux-kernel@...>
Date: Tuesday, August 26, 2008 - 3:43 pm

This patch series brings a long-awaited kernel memory manager to the i915
driver.  This will allow us to do correct composited OpenGL, speed up
OpenGL-based compositing, and enable framebuffer objects and other "new"
OpenGL extensions.  This patchset is also being built to enable kernel
modesetting for a non-root, flicker-free X Server.

This is a re-submit of the changes for DRM-GEM.  It relies on patches submitted
by airlied which are currently queued in linux-next.  The tree still has all
the changes required, based off of 2.6.27-rc4.

git://people.freedesktop.org/~anholt/linux-2.6 on the drm-gem-merge branch
http://cgit.freedesktop.org/~anholt/linux-2.6/log/?h=drm-gem-merge

New in this edition since the original submission:

- Exporting kmap_atomic_pfn.
  (The previous submission was slow because it was checking for the
  drm_compat.c version of this function from the external tree)
- shmem_getpage usage replaced with read_mapping_page.
- fixes for software fallbacks on tiled buffers.
- speedups for software fallbacks.
- replaced pci_read_base usage with using the MCHBAR mirror aperture
- fixed some issues on X server exit

What's not new in this edition:

Still using shmem_file_setup.  We need to be able to allocate objects from
the kernel, and didn't get any clear agreement that doing a VFS dance would be
preferable to letting us behave like other kernel subsystems and use the
function.

Still using small integers to identify our objects rather than fds.
We need more than just basic syscalls on the objects -- the alternate mmap
issue is more serious than before, for X pixmap usage.  There are also the
ioctls for cache management for software fallbacks.  And the issue of
getting high fds and large numbers of them still remained.
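
As a rough illustration of what identifying objects by small integers rather than fds can look like, here is a minimal userspace sketch of a per-client handle table. It is not the GEM code: the names (handle_table, handle_create, handle_lookup, handle_delete), the fixed MAX_HANDLES limit, and the linear slot search are all assumptions made for the example, and a kernel implementation would use its own ID allocator and locking.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical limit, chosen only for this sketch. */
#define MAX_HANDLES 1024

/* One table per client; slot 0 stays unused so 0 is never a valid handle. */
struct handle_table {
    void *objects[MAX_HANDLES];
};

/* Store obj in the lowest free slot and return its handle, or -1 on failure. */
int handle_create(struct handle_table *t, void *obj)
{
    if (obj == NULL)
        return -1;
    for (int h = 1; h < MAX_HANDLES; h++) {
        if (t->objects[h] == NULL) {
            t->objects[h] = obj;
            return h;
        }
    }
    return -1; /* table full */
}

/* Translate a handle back to its object; NULL if the handle is not in use. */
void *handle_lookup(const struct handle_table *t, int handle)
{
    if (handle <= 0 || handle >= MAX_HANDLES)
        return NULL;
    return t->objects[handle];
}

/* Release a handle so the slot can be reused; -1 if it was not in use. */
int handle_delete(struct handle_table *t, int handle)
{
    if (handle_lookup(t, handle) == NULL)
        return -1;
    t->objects[handle] = NULL;
    return 0;
}
```

Because the handles live in their own namespace, a client can hold many of them without consuming entries in the process file descriptor table, and operations on an object are passed the handle as an ioctl argument.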

Still have an issue with PAT on x86_64 -- initialization fails because
ioremap() apparently has different semantics there than on x86 or non-PAT
x86_64 (if somebody has mapped the space and there's a WC MTRR, the ioremap
that defaults to UC- ...
To: <linux-kernel@...>
Cc: Keith Packard <keithp@...>, Eric Anholt <eric@...>
Date: Tuesday, August 26, 2008 - 3:43 pm

From: Keith Packard <keithp@keithp.com>

GEM needs to create shmem files to back buffer objects.  Though currently
creation of files for objects could have been driven from userland, the
modesetting work will require allocation of buffer objects before userland
is running, for boot-time message display.

Signed-off-by: Eric Anholt <eric@anholt.net>
---
 mm/shmem.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/mm/shmem.c b/mm/shmem.c
index 04fb4f1..515909d 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2582,6 +2582,7 @@ put_memory:
 	shmem_unacct_size(flags, size);
 	return ERR_PTR(error);
 }
+EXPORT_SYMBOL(shmem_file_setup);
 
 /**
  * shmem_zero_setup - setup a shared anonymous mapping
-- 
1.5.6.3

--
To: <linux-kernel@...>
Cc: Eric Anholt <eric@...>
Date: Tuesday, August 26, 2008 - 3:43 pm

The driver would like to map IO space directly for copying data in when
appropriate, to avoid CPU cache flushing for streaming writes.
kmap_atomic_pfn lets us avoid IPIs associated with ioremap for this process.

Signed-off-by: Eric Anholt <eric@anholt.net>
---
 arch/x86/mm/highmem_32.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/arch/x86/mm/highmem_32.c b/arch/x86/mm/highmem_32.c
index 165c871..d52e91d 100644
--- a/arch/x86/mm/highmem_32.c
+++ b/arch/x86/mm/highmem_32.c
@@ -137,6 +137,7 @@ void *kmap_atomic_pfn(unsigned long pfn, enum km_type type)
 
 	return (void*) vaddr;
 }
+EXPORT_SYMBOL(kmap_atomic_pfn);
 
 struct page *kmap_atomic_to_page(void *ptr)
 {
-- 
1.5.6.3

--
To: Eric Anholt <eric@...>, Dave Airlie <airlied@...>, <linux-arch@...>
Cc: <linux-kernel@...>
Date: Wednesday, August 27, 2008 - 9:36 am

I wonder if you ever tested my vmap rework patches with this issue? It
seems somewhat x86 specific and also not conceptually so clean to use
kmap_atomic_pfn for this. vmap may not be used by all architectures but
I think it might be able to cover some of them.

As I said, there are some other possible improvements that can be made
to my vmap rewrite if performance isn't good enough, but I simply have
not seen numbers...

Thanks,
Nick
--
To: Nick Piggin <nickpiggin@...>
Cc: Dave Airlie <airlied@...>, <linux-arch@...>, <linux-kernel@...>
Date: Wednesday, August 27, 2008 - 12:52 pm

The consumer of this is a driver for Intel platforms, so being
x86-specific is not a worry for this patch series.

However, when other DRM drivers get around to doing memory management,
I'm sure they'll also be interested in an ioremap_wc that doesn't eat
ipi costs.  For us, the ipis for flushing were eating over 10% of CPU
time.  If your patch series cuts that cost, we could drop this piece at
that point.

-- 
Eric Anholt
eric@anholt.net                         eric.anholt@intel.com
To: Eric Anholt <eric@...>
Cc: Dave Airlie <airlied@...>, <linux-arch@...>, <linux-kernel@...>
Date: Wednesday, August 27, 2008 - 8:22 pm

It would help verify and improve the new vmap code, and it would be
"doing the right thing" to begin with. It would avoid some nasty
ifdefery in your driver too. And what about 64 bit x86 that doesn't

It can cut the cost quite significantly on normal vmap/vunmap loads I
tested. Whether it will work as well on your workload, I don't know
but I would have liked to find out. I raised this issue quite a while
back, so I'm disappointed it had not been tried...

--
To: Nick Piggin <nickpiggin@...>
Cc: Eric Anholt <eric@...>, Dave Airlie <airlied@...>, <linux-arch@...>, <linux-kernel@...>
Date: Monday, September 22, 2008 - 1:59 pm

I think Eric has code with the vmap changes now?

Given our discussions at KS/Plumbers would you be ok with acking these 
patches?  Or do you want a repost so you can check out the vmap stuff?

After talking a bit more about it, I think we agreed that the ioctl interface 
is actually a better approach than trying to shoehorn this stuff into system 
calls, so aside from the vmap code (which could be done in 2.6.29 or whenever 
the vmap stuff lands) I think this patchset is pretty close to what we want 
in drm-next now...

Thanks,
Jesse
--
To: Jesse Barnes <jbarnes@...>
Cc: Nick Piggin <nickpiggin@...>, Dave Airlie <airlied@...>, <linux-arch@...>, <linux-kernel@...>
Date: Tuesday, September 23, 2008 - 12:32 pm

I've been trying to get a good test of the vmap changes.  Unfortunately,
it looks like things have changed in ioremap in the intervening time
between when I last tested and now.  I was taking 2.6.27-rc5-mm1 (since
that was the only git tree I found with Nick's changes in it,
unfortunately) and merging our stuff into there then reverting the
kmap_atomic_prot_pfn bit.

However, we've moved from 10% cost in unmapping due to IPIing to a >30%
cost in mapping due to change_page_attr.  This is regardless of whether
I do ioremap or ioremap_wc.  The physical range is covered by a WC MTRR,
and the X Server's mapping the memory using the _wc resource.  In the
DRM, I just tried moving every ioremap to _wc, and it's the same.  The
call chain looks like

i915_gem_pwrite_ioctl (60%)
ioremap_wc (39%)
ioremap_nocache (39%)
ioremap_caller (39%)
ioremap_change_attr (36%)
_set_memory_uc (36%)
change_page_attr_set_clr (36%)
vm_unmap_aliases (31%)
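
The chain above can be broken down per function with a little arithmetic. This is a minimal sketch, assuming each percentage is cumulative (it includes time spent in callees) and that the chain is strictly linear, so the time a function spends outside the next function down is its own figure minus the next one. The struct and function names are made up for the example; only the function names and percentages in the table come from the profile.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

struct frame {
    const char *name;
    int inclusive_pct; /* assumed to include time spent in callees */
};

/* Call chain and percentages as reported in the profile. */
static const struct frame chain[] = {
    { "i915_gem_pwrite_ioctl",    60 },
    { "ioremap_wc",               39 },
    { "ioremap_nocache",          39 },
    { "ioremap_caller",           39 },
    { "ioremap_change_attr",      36 },
    { "_set_memory_uc",           36 },
    { "change_page_attr_set_clr", 36 },
    { "vm_unmap_aliases",         31 },
};

#define CHAIN_LEN (sizeof(chain) / sizeof(chain[0]))

/* Percentage spent in frame i outside the next frame in the chain. */
int exclusive_pct(size_t i)
{
    assert(i < CHAIN_LEN);
    int callee = (i + 1 < CHAIN_LEN) ? chain[i + 1].inclusive_pct : 0;
    return chain[i].inclusive_pct - callee;
}

/* Print inclusive and derived exclusive percentages for every frame. */
void print_exclusive(FILE *out)
{
    for (size_t i = 0; i < CHAIN_LEN; i++)
        fprintf(out, "%-26s %2d%% inclusive, %2d%% exclusive\n",
                chain[i].name, chain[i].inclusive_pct, exclusive_pct(i));
}
```

Under that assumption, 21 of the 60 points are spent in the pwrite path outside ioremap_wc, while 31 of the 39 points under ioremap_wc sit in or below vm_unmap_aliases; the last entry cannot be split further from the data shown.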

Maybe if I go back and generate a tree of just Nick's changes and try to
merge them forward I can get a good test.  However, it looks to me like
we've got some serious brokenness in ioremap and attribute handling
coming up.

-- 
Eric Anholt
eric@anholt.net                         eric.anholt@intel.com
To: <linux-kernel@...>
Cc: Eric Anholt <eric@...>
Date: Tuesday, August 26, 2008 - 3:43 pm

GEM allows the creation of persistent buffer objects accessible by the
graphics device through new ioctls for managing execution of commands on the
device.  The userland API is almost entirely driver-specific to ensure that
any driver building on this model can easily map the interface to individual
driver requirements.

GEM is used by the 2d driver for managing its internal state allocations and
will be used for pixmap storage to reduce memory consumption and enable
zero-copy GLX_EXT_texture_from_pixmap, and in the 3d driver is used to enable
GL_EXT_framebuffer_object and GL_ARB_pixel_buffer_object.

Signed-off-by: Eric Anholt <eric@anholt.net>
---
 drivers/gpu/drm/Makefile               |    5 +-
 drivers/gpu/drm/drm_agpsupport.c       |   51 +-
 drivers/gpu/drm/drm_cache.c            |   76 +
 drivers/gpu/drm/drm_drv.c              |    4 +
 drivers/gpu/drm/drm_fops.c             |    6 +
 drivers/gpu/drm/drm_gem.c              |  420 ++++++
 drivers/gpu/drm/drm_memory.c           |    2 +
 drivers/gpu/drm/drm_mm.c               |    5 +-
 drivers/gpu/drm/drm_proc.c             |  135 ++-
 drivers/gpu/drm/drm_stub.c             |   10 +
 drivers/gpu/drm/i915/Makefile          |    6 +-
 drivers/gpu/drm/i915/i915_dma.c        |   94 +-
 drivers/gpu/drm/i915/i915_drv.c        |    8 +-
 drivers/gpu/drm/i915/i915_drv.h        |  253 ++++-
 drivers/gpu/drm/i915/i915_gem.c        | 2509 ++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_gem_debug.c  |  201 +++
 drivers/gpu/drm/i915/i915_gem_proc.c   |  292 ++++
 drivers/gpu/drm/i915/i915_gem_tiling.c |  256 ++++
 drivers/gpu/drm/i915/i915_irq.c        |    8 +-
 drivers/gpu/drm/i915/i915_reg.h        |   37 +-
 include/drm/drm.h                      |   31 +
 include/drm/drmP.h                     |  151 ++
 include/drm/i915_drm.h                 |  332 +++++
 23 files changed, 4835 insertions(+), 57 deletions(-)
 create mode 100644 drivers/gpu/drm/drm_cache.c
 create mode 100644 drivers/gpu/drm/drm_gem.c
 creat...