DRM Memory Management
Modern Linux systems require large amounts of graphics memory to store frame buffers, textures, vertices and other graphics-related data. Given the very dynamic nature of much of that data, managing graphics memory efficiently is crucial for the graphics stack and plays a central role in the DRM infrastructure.
The DRM core includes two memory managers, namely the Translation Table Manager (TTM) and the Graphics Execution Manager (GEM). TTM was the first DRM memory manager to be developed and tried to be a one-size-fits-all solution. It provides a single userspace API to accommodate the needs of all hardware, supporting both Unified Memory Architecture (UMA) devices and devices with dedicated video RAM (i.e. most discrete video cards). This resulted in a large, complex piece of code that turned out to be hard to use for driver development.
GEM started as an Intel-sponsored project in reaction to TTM’s complexity. Its design philosophy is completely different: instead of providing a solution to every graphics memory-related problem, GEM identified common code between drivers and created a support library to share it. GEM has simpler initialization and execution requirements than TTM, but has no video RAM management capabilities and is thus limited to UMA devices.
The Translation Table Manager (TTM)
TTM design background and information belongs here.
TTM initialization
Warning: This section is outdated.
Drivers wishing to support TTM must pass a filled ttm_bo_driver
structure to ttm_bo_device_init, together with an
initialized global reference to the memory manager. The ttm_bo_driver
structure contains several fields with function pointers for
initializing the TTM, allocating and freeing memory, waiting for command
completion and fence synchronization, and memory migration.
The struct drm_global_reference is made up of several fields:
struct drm_global_reference {
    enum ttm_global_types global_type;
    size_t size;
    void *object;
    int (*init) (struct drm_global_reference *);
    void (*release) (struct drm_global_reference *);
};
There should be one global reference structure for your memory manager as a whole, and there will be others for each object created by the memory manager at runtime. Your global TTM should have a type of TTM_GLOBAL_TTM_MEM. The size field for the global object should be sizeof(struct ttm_mem_global), and the init and release hooks should point at your driver-specific init and release routines, which probably eventually call ttm_mem_global_init and ttm_mem_global_release, respectively.
Once your global TTM accounting structure is set up and initialized by calling ttm_global_item_ref() on it, you need to create a buffer object TTM to provide a pool for buffer object allocation by clients and the kernel itself. The type of this object should be TTM_GLOBAL_TTM_BO, and its size should be sizeof(struct ttm_bo_global). Again, driver-specific init and release functions may be provided, likely eventually calling ttm_bo_global_init() and ttm_bo_global_release(), respectively. Also, like the previous object, ttm_global_item_ref() is used to create an initial reference count for the TTM, which will call your initialization function.
See the radeon_ttm.c file for an example of usage.
The Graphics Execution Manager (GEM)
The GEM design approach has resulted in a memory manager that doesn’t provide full coverage of all (or even all common) use cases in its userspace or kernel API. GEM exposes a set of standard memory-related operations to userspace and a set of helper functions to drivers, and lets drivers implement hardware-specific operations with their own private API.
The GEM userspace API is described in the GEM - the Graphics Execution Manager article on LWN. While slightly outdated, the document provides a good overview of the GEM API principles. Buffer allocation and read and write operations, described as part of the common GEM API, are currently implemented using driver-specific ioctls.
GEM is data-agnostic. It manages abstract buffer objects without knowing what individual buffers contain. APIs that require knowledge of buffer contents or purpose, such as buffer allocation or synchronization primitives, are thus outside of the scope of GEM and must be implemented using driver-specific ioctls.
On a fundamental level, GEM involves several operations:
- Memory allocation and freeing
- Command execution
- Aperture management at command execution time
Buffer object allocation is relatively straightforward and largely provided by Linux’s shmem layer, which provides memory to back each object.
Device-specific operations, such as command execution, pinning, buffer read & write, mapping, and domain ownership transfers are left to driver-specific ioctls.
GEM Initialization
Drivers that use GEM must set the DRIVER_GEM bit in the struct drm_driver
driver_features field. The DRM core will then automatically initialize
the GEM core before calling the load operation. Behind the scenes, this
will create a DRM Memory Manager object which provides an address space
pool for object allocation.
In a KMS configuration, drivers need to allocate and initialize a command ring buffer following core GEM initialization if required by the hardware. UMA devices usually have what is called a “stolen” memory region, which provides space for the initial framebuffer and large, contiguous memory regions required by the device. This space is typically not managed by GEM, and must be initialized separately into its own DRM MM object.
GEM Objects Creation
GEM splits creation of GEM objects and allocation of the memory that backs them into two distinct operations.
GEM objects are represented by an instance of struct drm_gem_object.
Drivers usually need to extend GEM objects with private information and
thus create a driver-specific GEM object structure type that embeds an
instance of struct drm_gem_object.
To create a GEM object, a driver allocates memory for an instance of its
specific GEM object type and initializes the embedded struct
drm_gem_object with a call to drm_gem_object_init(). The function takes
a pointer to the DRM device, a pointer to the GEM object and the buffer
object size in bytes.
GEM uses shmem to allocate anonymous pageable memory.
drm_gem_object_init() will create an shmfs file of the requested size
and store it into the struct drm_gem_object filp field. The memory is
used as either main storage for the object when the graphics hardware
uses system memory directly or as a backing store otherwise.
Drivers are responsible for the actual physical page allocation by
calling shmem_read_mapping_page_gfp() for each page.
Note that they can decide to allocate pages when initializing the GEM
object, or to delay allocation until the memory is needed (for instance
when a page fault occurs as a result of a userspace memory access or
when the driver needs to start a DMA transfer involving the memory).
Anonymous pageable memory allocation is not always desired, for instance
when the hardware requires physically contiguous system memory, as is
often the case in embedded devices. Drivers can create GEM objects with
no shmfs backing (called private GEM objects) by initializing them with
a call to drm_gem_private_object_init() instead of
drm_gem_object_init(). Storage for private GEM objects must be managed
by drivers.
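The embedding pattern described in this section can be sketched in plain C. This is a userspace model, not kernel code: struct model_gem_object stands in for struct drm_gem_object, the model_* functions stand in for drm_gem_object_init() and drm_gem_private_object_init(), and struct foo_bo with its tiling_mode field is a hypothetical driver type invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for struct drm_gem_object; only the fields the sketch needs. */
struct model_gem_object {
    size_t size;        /* object size in bytes */
    int has_backing;    /* 1: shmem-like backing store, 0: private object */
};

/* Hypothetical driver-specific object that embeds the generic one. */
struct foo_bo {
    struct model_gem_object base;   /* embedded generic GEM object */
    unsigned int tiling_mode;       /* driver-private state (made up) */
};

/* Recover the driver object from a pointer to its embedded base,
 * in the spirit of the kernel's container_of(). */
static struct foo_bo *to_foo_bo(struct model_gem_object *obj)
{
    return (struct foo_bo *)((char *)obj - offsetof(struct foo_bo, base));
}

/* Models drm_gem_object_init(): records the size, marks shmem backing. */
static int model_gem_object_init(struct model_gem_object *obj, size_t size)
{
    obj->size = size;
    obj->has_backing = 1;
    return 0;
}

/* Models drm_gem_private_object_init(): no backing store is created. */
static void model_gem_private_object_init(struct model_gem_object *obj,
                                          size_t size)
{
    obj->size = size;
    obj->has_backing = 0;
}

/* Step 1: allocate the driver type. Step 2: initialize the embedded base. */
static struct foo_bo *foo_bo_create(size_t size, int private_storage)
{
    struct foo_bo *bo = calloc(1, sizeof(*bo));

    if (!bo)
        return NULL;
    if (private_storage) {
        model_gem_private_object_init(&bo->base, size);
    } else if (model_gem_object_init(&bo->base, size) != 0) {
        free(bo);
        return NULL;
    }
    return bo;
}
```

Embedding the generic object as a member, rather than pointing to it, lets a single allocation cover both parts and lets generic callbacks recover the driver type from the base pointer.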
GEM Objects Lifetime
All GEM objects are reference-counted by the GEM core. References can be
acquired and released by calling drm_gem_object_get() and
drm_gem_object_put() respectively. The caller must hold the struct
drm_device struct_mutex lock when calling drm_gem_object_get(). As a
convenience, GEM provides the drm_gem_object_put_unlocked() function,
which can be called without holding the lock.
When the last reference to a GEM object is released the GEM core calls
the struct drm_driver gem_free_object_unlocked operation. That operation
is mandatory for GEM-enabled drivers and must free the GEM object and
all associated resources.
void (*gem_free_object_unlocked) (struct drm_gem_object *obj);
Drivers are responsible for freeing all GEM object resources. This
includes the resources created by the GEM core, which need to be
released with drm_gem_object_release().
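The get/put/free cycle can be illustrated with a small userspace model. It is a sketch under stated assumptions: a plain counter replaces the kernel's struct kref, locking is omitted entirely (so it is not thread-safe), and every model_* and foo_* name is hypothetical. freed_objects exists only so the behaviour can be observed.

```c
#include <assert.h>
#include <stdlib.h>

struct model_gem_object;

/* Stand-in for the driver's mandatory free operation. */
typedef void (*model_free_fn)(struct model_gem_object *obj);

struct model_gem_object {
    unsigned int refcount;  /* the kernel uses struct kref; plain counter here */
    model_free_fn free;     /* called when the last reference is dropped */
    int core_released;      /* set once core resources were released */
};

static int freed_objects;   /* observation aid: number of completed frees */

static void model_gem_object_get(struct model_gem_object *obj)
{
    assert(obj->refcount > 0);  /* caller must already hold a reference */
    obj->refcount++;
}

static void model_gem_object_put(struct model_gem_object *obj)
{
    assert(obj->refcount > 0);
    if (--obj->refcount == 0)
        obj->free(obj);
}

/* Models drm_gem_object_release(): drops resources created by the core. */
static void model_gem_object_release(struct model_gem_object *obj)
{
    obj->core_released = 1;
}

/* Hypothetical driver free operation: core resources first, then the object. */
static void foo_free_object(struct model_gem_object *obj)
{
    model_gem_object_release(obj);
    freed_objects++;
    free(obj);
}

static struct model_gem_object *foo_object_create(void)
{
    struct model_gem_object *obj = calloc(1, sizeof(*obj));

    if (!obj)
        return NULL;
    obj->refcount = 1;          /* initial reference owned by the creator */
    obj->free = foo_free_object;
    return obj;
}
```

The free operation runs exactly once, on the put that drops the count to zero, and the object must not be touched by that caller afterwards.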
GEM Objects Naming
Communication between userspace and the kernel refers to GEM objects using local handles, global names or, more recently, file descriptors. All of those are 32-bit integer values; the usual Linux kernel limits apply to the file descriptors.
GEM handles are local to a DRM file. Applications get a handle to a GEM object through a driver-specific ioctl, and can use that handle to refer to the GEM object in other standard or driver-specific ioctls. Closing a DRM file handle frees all its GEM handles and dereferences the associated GEM objects.
To create a handle for a GEM object drivers call
drm_gem_handle_create(). The function takes a pointer to the DRM file
and the GEM object and returns a locally unique handle. When the handle
is no longer needed drivers delete it with a call to
drm_gem_handle_delete(). Finally the GEM object associated with a handle
can be retrieved by a call to drm_gem_object_lookup().
Handles don’t take ownership of GEM objects; they only take a reference to the object that will be dropped when the handle is destroyed. To avoid leaking GEM objects, drivers must make sure they drop the reference(s) they own (such as the initial reference taken at object creation time) as appropriate, without any special consideration for the handle. For example, in the particular case of combined GEM object and handle creation in the implementation of the dumb_create operation, drivers must drop the initial reference to the GEM object before returning the handle.
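The reference rule for combined object and handle creation can be sketched as a userspace model. Everything here is an assumption made for illustration: struct model_file stands in for the per-file handle table, model_handle_create() and model_handle_delete() stand in for drm_gem_handle_create() and drm_gem_handle_delete(), the table size is arbitrary, and live_objects exists only to make leaks visible.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define MODEL_MAX_HANDLES 8

struct model_gem_object {
    unsigned int refcount;
};

/* Stand-in for the per-file handle table; slot 0 is never a valid handle. */
struct model_file {
    struct model_gem_object *objs[MODEL_MAX_HANDLES + 1];
};

static int live_objects;    /* observation aid: objects not yet freed */

static struct model_gem_object *model_object_create(void)
{
    struct model_gem_object *obj = calloc(1, sizeof(*obj));

    if (obj) {
        obj->refcount = 1;  /* initial reference, owned by the creator */
        live_objects++;
    }
    return obj;
}

static void model_object_put(struct model_gem_object *obj)
{
    if (--obj->refcount == 0) {
        live_objects--;
        free(obj);
    }
}

/* Models drm_gem_handle_create(): the handle takes its own reference. */
static int model_handle_create(struct model_file *file,
                               struct model_gem_object *obj, uint32_t *handlep)
{
    for (uint32_t h = 1; h <= MODEL_MAX_HANDLES; h++) {
        if (!file->objs[h]) {
            obj->refcount++;
            file->objs[h] = obj;
            *handlep = h;
            return 0;
        }
    }
    return -1;  /* table full */
}

/* Models drm_gem_handle_delete(): drops the reference held by the handle. */
static int model_handle_delete(struct model_file *file, uint32_t handle)
{
    if (handle == 0 || handle > MODEL_MAX_HANDLES || !file->objs[handle])
        return -1;
    model_object_put(file->objs[handle]);
    file->objs[handle] = NULL;
    return 0;
}

/* Combined creation, in the style of a dumb_create implementation. */
static int foo_create_with_handle(struct model_file *file, uint32_t *handlep)
{
    struct model_gem_object *obj = model_object_create();
    int ret;

    if (!obj)
        return -1;
    ret = model_handle_create(file, obj, handlep);
    /* Drop the initial reference in both cases: on success the handle
     * keeps the object alive, on failure this frees it. */
    model_object_put(obj);
    return ret;
}
```

After foo_create_with_handle() succeeds, the handle holds the only reference, so deleting the handle frees the object and nothing leaks.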
GEM names are similar in purpose to handles but are not local to DRM files. They can be passed between processes to reference a GEM object globally. Names can’t be used directly to refer to objects in the DRM API; applications must convert handles to names and names to handles using the DRM_IOCTL_GEM_FLINK and DRM_IOCTL_GEM_OPEN ioctls respectively. The conversion is handled by the DRM core without any driver-specific support.
GEM also supports buffer sharing with dma-buf file descriptors through PRIME. GEM-based drivers must use the provided helper functions to implement the exporting and importing correctly. See ?. Since sharing file descriptors is inherently more secure than the easily guessable and global GEM names, it is the preferred buffer sharing mechanism. Sharing buffers through GEM names is only supported for legacy userspace. Furthermore, PRIME also allows cross-device buffer sharing since it is based on dma-bufs.
GEM Objects Mapping
Because mapping operations are fairly heavyweight, GEM favours read/write-like access to buffers, implemented through driver-specific ioctls, over mapping buffers to userspace. However, when random access to the buffer is needed (to perform software rendering for instance), direct access to the object can be more efficient.
The mmap system call can’t be used directly to map GEM objects, as they
don’t have their own file handle. Two alternative methods currently
co-exist to map GEM objects to userspace. The first method uses a
driver-specific ioctl to perform the mapping operation, calling
do_mmap() under the hood. This is often considered dubious, seems to be
discouraged for new GEM-enabled drivers, and will thus not be described
here.
The second method uses the mmap system call on the DRM file handle.
void *mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset);
DRM identifies the GEM object to be mapped by a fake offset passed
through the mmap offset argument. Prior to being mapped, a GEM object
must thus be associated with a fake offset. To do so, drivers must call
drm_gem_create_mmap_offset() on the object.
Once allocated, the fake offset value must be passed to the application in a driver-specific way and can then be used as the mmap offset argument.
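The fake-offset scheme can be modelled in userspace as a table that hands out page-aligned ranges and later resolves an offset back to its object. This is a sketch, not the DRM offset manager: struct model_device, the base of the fake range, the page size and the fixed table size are all assumptions, and model_create_mmap_offset() only imitates drm_gem_create_mmap_offset(), treating a repeated call as a no-op in line with the idempotency noted in the function reference.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE   4096u
#define MODEL_OFFSET_BASE 0x100000u /* arbitrary start of the fake range, in pages */
#define MODEL_MAX_OBJECTS 4

struct model_gem_object {
    size_t size;            /* object size in bytes */
    uint64_t fake_offset;   /* byte offset handed to userspace, 0 if none yet */
};

/* Stand-in for the per-device offset manager. */
struct model_device {
    struct model_gem_object *mapped[MODEL_MAX_OBJECTS];
    unsigned int count;
    uint64_t next_page;     /* next free page index in the fake range */
};

static void model_device_init(struct model_device *dev)
{
    dev->count = 0;
    dev->next_page = MODEL_OFFSET_BASE;
    for (unsigned int i = 0; i < MODEL_MAX_OBJECTS; i++)
        dev->mapped[i] = NULL;
}

/* Imitates drm_gem_create_mmap_offset(): reserve a page-aligned fake range. */
static int model_create_mmap_offset(struct model_device *dev,
                                    struct model_gem_object *obj)
{
    uint64_t pages = (obj->size + MODEL_PAGE_SIZE - 1) / MODEL_PAGE_SIZE;

    if (obj->fake_offset)
        return 0;           /* already allocated, nothing to do */
    if (dev->count == MODEL_MAX_OBJECTS)
        return -1;
    obj->fake_offset = dev->next_page * MODEL_PAGE_SIZE;
    dev->next_page += pages;
    dev->mapped[dev->count++] = obj;
    return 0;
}

/* Imitates the lookup done at mmap time: find the object whose fake
 * range starts at the offset passed by userspace. */
static struct model_gem_object *model_lookup_offset(struct model_device *dev,
                                                    uint64_t offset)
{
    for (unsigned int i = 0; i < dev->count; i++)
        if (dev->mapped[i]->fake_offset == offset)
            return dev->mapped[i];
    return NULL;
}
```

The offset carries no meaning as a file position; it is only a key that lets a single DRM file descriptor serve mappings for many objects.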
The GEM core provides a helper method drm_gem_mmap() to handle object
mapping. The method can be set directly as the mmap file operation
handler. It will look up the GEM object based on the offset value and
set the VMA operations to the struct drm_driver gem_vm_ops field. Note
that drm_gem_mmap() doesn’t map memory to userspace, but relies on the
driver-provided fault handler to map pages individually.
To use drm_gem_mmap(), drivers must fill the struct drm_driver
gem_vm_ops field with a pointer to VM operations.
The VM operations is a struct vm_operations_struct made up of several
fields, the more interesting ones being:
struct vm_operations_struct {
    void (*open)(struct vm_area_struct *area);
    void (*close)(struct vm_area_struct *area);
    int (*fault)(struct vm_fault *vmf);
};
The open and close operations must update the GEM object reference
count. Drivers can use the drm_gem_vm_open() and drm_gem_vm_close()
helper functions directly as open and close handlers.
The fault operation handler is responsible for mapping individual pages to userspace when a page fault occurs. Depending on the memory allocation scheme, drivers can allocate pages at fault time, or can decide to allocate memory for the GEM object at the time the object is created.
Drivers that want to map the GEM object upfront instead of handling page faults can implement their own mmap file operation handler.
For platforms without an MMU the GEM core provides a helper method
drm_gem_cma_get_unmapped_area(). The mmap() routines will call this to
get a proposed address for the mapping.
To use drm_gem_cma_get_unmapped_area(), drivers must fill the struct
file_operations get_unmapped_area field with a pointer to
drm_gem_cma_get_unmapped_area().
More detailed information about get_unmapped_area can be found in Documentation/nommu-mmap.txt.
Memory Coherency
When mapped to the device or used in a command buffer, backing pages for an object are flushed to memory and marked write combined so as to be coherent with the GPU. Likewise, if the CPU accesses an object after the GPU has finished rendering to the object, then the object must be made coherent with the CPU’s view of memory, usually involving GPU cache flushing of various kinds. This core CPU<->GPU coherency management is provided by a device-specific ioctl, which evaluates an object’s current domain and performs any necessary flushing or synchronization to put the object into the desired coherency domain (note that the object may be busy, i.e. an active render target; in that case, setting the domain blocks the client and waits for rendering to complete before performing any necessary flushing operations).
Command Execution
Perhaps the most important GEM function for GPU devices is providing a command execution interface to clients. Client programs construct command buffers containing references to previously allocated memory objects, and then submit them to GEM. At that point, GEM takes care to bind all the objects into the GTT, execute the buffer, and provide necessary synchronization between clients accessing the same buffers. This often involves evicting some objects from the GTT and re-binding others (a fairly expensive operation), and providing relocation support which hides fixed GTT offsets from clients. Clients must take care not to submit command buffers that reference more objects than can fit in the GTT; otherwise, GEM will reject them and no rendering will occur. Similarly, if several objects in the buffer require fence registers to be allocated for correct rendering (e.g. 2D blits on pre-965 chips), care must be taken not to require more fence registers than are available to the client. Such resource management should be abstracted from the client in libdrm.
GEM Function Reference
struct drm_gem_object_funcs
GEM object functions
Definition
struct drm_gem_object_funcs {
    void (*free)(struct drm_gem_object *obj);
    int (*open)(struct drm_gem_object *obj, struct drm_file *file);
    void (*close)(struct drm_gem_object *obj, struct drm_file *file);
    void (*print_info)(struct drm_printer *p, unsigned int indent, const struct drm_gem_object *obj);
    struct dma_buf *(*export)(struct drm_gem_object *obj, int flags);
    int (*pin)(struct drm_gem_object *obj);
    void (*unpin)(struct drm_gem_object *obj);
    struct sg_table *(*get_sg_table)(struct drm_gem_object *obj);
    int (*vmap)(struct drm_gem_object *obj, struct iosys_map *map);
    void (*vunmap)(struct drm_gem_object *obj, struct iosys_map *map);
    int (*mmap)(struct drm_gem_object *obj, struct vm_area_struct *vma);
    const struct vm_operations_struct *vm_ops;
};
Members
free
Deconstructor for drm_gem_objects.
This callback is mandatory.
open
Called upon GEM handle creation.
This callback is optional.
close
Called upon GEM handle release.
This callback is optional.
print_info
If a driver subclasses struct drm_gem_object, it can implement this optional hook for printing additional driver-specific info.
drm_printf_indent() should be used in the callback, passing it the indent argument.
This callback is called from drm_gem_print_info().
This callback is optional.
export
Export backing buffer as a dma_buf. If this is not set, drm_gem_prime_export() is used.
This callback is optional.
pin
Pin backing buffer in memory. Used by the drm_gem_map_attach() helper.
This callback is optional.
unpin
Unpin backing buffer. Used by the drm_gem_map_detach() helper.
This callback is optional.
get_sg_table
Returns a Scatter-Gather table representation of the buffer. Used when exporting a buffer by the drm_gem_map_dma_buf() helper. Releasing is done by calling dma_unmap_sg_attrs() and sg_free_table() in drm_gem_unmap_buf(), therefore these helpers and this callback here cannot be used for sg tables pointing at driver private memory ranges.
See also drm_prime_pages_to_sg().
vmap
Returns a virtual address for the buffer. Used by the drm_gem_dmabuf_vmap() helper.
This callback is optional.
vunmap
Releases the address previously returned by vmap. Used by the drm_gem_dmabuf_vunmap() helper.
This callback is optional.
mmap
Handle mmap() of the gem object, setup vma accordingly.
This callback is optional.
The callback is used by both drm_gem_mmap_obj() and drm_gem_prime_mmap(). When mmap is present vm_ops is not used, the mmap callback must set vma->vm_ops instead.
vm_ops
Virtual memory operations used with mmap.
This is optional but necessary for mmap support.
struct drm_gem_lru
A simple LRU helper
Definition
struct drm_gem_lru {
    struct mutex *lock;
    long count;
    struct list_head list;
};
Members
lock
- Lock protecting movement of GEM objects between LRUs. All LRUs that the object can move between should be protected by the same lock.
count
- The total number of backing pages of the GEM objects in this LRU.
list
- The LRU list.
Description
A helper for tracking GEM objects in a given state, to aid in a
driver’s shrinker implementation. Tracks the count of pages for a
lockless shrinker.count_objects, and provides drm_gem_lru_scan for the
driver’s shrinker.scan_objects implementation.
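How a per-LRU page count stays accurate while objects move between lists can be sketched with a small userspace model. This is not the kernel implementation: the model_* names, the 4 KiB page size and the hand-rolled circular list are assumptions, and the shared lock described under the lock member is omitted, so the sketch is single-threaded only.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PAGE_SHIFT 12     /* assumes 4 KiB pages */

struct model_lru;

struct model_gem_object {
    size_t size;                /* bytes, assumed page-aligned */
    struct model_lru *lru;      /* LRU the object is currently on, or NULL */
    struct model_gem_object *prev, *next;
};

struct model_lru {
    long count;                     /* total backing pages on this LRU */
    struct model_gem_object head;   /* list sentinel; head.next is the oldest */
};

static void model_lru_init(struct model_lru *lru)
{
    lru->count = 0;
    lru->head.prev = lru->head.next = &lru->head;
}

/* Unlink obj from whatever LRU it is on and subtract its pages. */
static void model_lru_remove(struct model_gem_object *obj)
{
    if (!obj->lru)
        return;
    obj->lru->count -= (long)(obj->size >> MODEL_PAGE_SHIFT);
    obj->prev->next = obj->next;
    obj->next->prev = obj->prev;
    obj->lru = NULL;
}

/* Move obj to the most recently used end of lru, fixing both counts. */
static void model_lru_move_tail(struct model_lru *lru,
                                struct model_gem_object *obj)
{
    model_lru_remove(obj);
    obj->prev = lru->head.prev;
    obj->next = &lru->head;
    lru->head.prev->next = obj;
    lru->head.prev = obj;
    obj->lru = lru;
    lru->count += (long)(obj->size >> MODEL_PAGE_SHIFT);
}
```

Because every move adjusts the counts of the source and destination lists together, a reader of count never needs to walk the list, which is what makes a cheap count_objects callback possible.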
struct drm_gem_object
GEM buffer object
Definition
struct drm_gem_object {
    struct kref refcount;
    unsigned handle_count;
    struct drm_device *dev;
    struct file *filp;
    struct drm_vma_offset_node vma_node;
    size_t size;
    int name;
    struct dma_buf *dma_buf;
    struct dma_buf_attachment *import_attach;
    struct dma_resv *resv;
    struct dma_resv _resv;
    const struct drm_gem_object_funcs *funcs;
    struct list_head lru_node;
    struct drm_gem_lru *lru;
};
Members
refcount
Reference count of this object.
Please use drm_gem_object_get() to acquire and drm_gem_object_put_locked() or drm_gem_object_put() to release a reference to a GEM buffer object.
handle_count
This is the GEM file_priv handle count of this object.
Each handle also holds a reference. Note that when the handle_count drops to 0 any global names (e.g. the id in the flink namespace) will be cleared.
Protected by drm_device.object_name_lock.
dev
- DRM dev this object belongs to.
filp
- SHMEM file node used as backing storage for swappable buffer objects. GEM also supports driver private objects with driver-specific backing storage (contiguous DMA memory, special reserved blocks). In this case filp is NULL.
vma_node
Mapping info for this object to support mmap. Drivers are supposed to allocate the mmap offset using drm_gem_create_mmap_offset(). The offset itself can be retrieved using drm_vma_node_offset_addr().
Memory mapping itself is handled by drm_gem_mmap(), which also checks that userspace is allowed to access the object.
size
- Size of the object, in bytes. Immutable over the object’s lifetime.
name
- Global name for this object, starts at 1. 0 means unnamed. Access is covered by drm_device.object_name_lock. This is used by the GEM_FLINK and GEM_OPEN ioctls.
dma_buf
dma-buf associated with this GEM object.
Pointer to the dma-buf associated with this gem object (either through importing or exporting). We break the resulting reference loop when the last gem handle for this object is released.
Protected by drm_device.object_name_lock.
import_attach
dma-buf attachment backing this object.
Any foreign dma_buf imported as a gem object has this set to the attachment point for the device. This is invariant over the lifetime of a gem object.
The drm_gem_object_funcs.free callback is responsible for cleaning up the dma_buf attachment and references acquired at import time.
Note that the drm gem/prime core does not depend upon drivers setting this field any more. So for drivers where this doesn’t make sense (e.g. virtual devices or a displaylink behind a USB bus) they can simply leave it as NULL.
resv
Pointer to the reservation object associated with this GEM object.
Normally (resv == &_resv) except for imported GEM objects.
_resv
A reservation object for this GEM object.
This is unused for imported GEM objects.
funcs
Optional GEM object functions. If this is set, it will be used instead of the corresponding drm_driver GEM callbacks.
New drivers should use this.
lru_node
- List node in a drm_gem_lru.
lru
- The current LRU list that the GEM object is on.
Description
This structure defines the generic parts for GEM buffer objects, which are mostly around handling mmap and userspace handles.
Buffer objects are often abbreviated to BO.
DRM_GEM_FOPS()
Default drm GEM file operations
Description
This macro provides a shorthand for setting the GEM file ops in the
file_operations structure. If all you need are the default ops, use
DEFINE_DRM_GEM_FOPS instead.
DEFINE_DRM_GEM_FOPS(name)
macro to generate file operations for GEM drivers
Parameters
name
- name for the generated structure
Description
This macro autogenerates a suitable struct file_operations for GEM based
drivers, which can be assigned to drm_driver.fops. Note that this
structure cannot be shared between drivers, because it contains a
reference to the current module using THIS_MODULE.
Note that the declaration is already marked as static - if you need a non-static version of this you’re probably doing it wrong and will break the THIS_MODULE reference by accident.
void drm_gem_object_get(struct drm_gem_object * obj)
acquire a GEM buffer object reference
Parameters
struct drm_gem_object * obj
- GEM buffer object
Description
This function acquires an additional reference to obj. It is illegal to call this without already holding a reference. No locks required.
void drm_gem_object_put(struct drm_gem_object * obj)
drop a GEM buffer object reference
Parameters
struct drm_gem_object * obj
- GEM buffer object
Description
This releases a reference to obj.
int drm_gem_object_init(struct drm_device * dev, struct drm_gem_object * obj, size_t size)
initialize an allocated shmem-backed GEM object
Parameters
struct drm_device * dev
- drm_device the object should be initialized for
struct drm_gem_object * obj
- drm_gem_object to initialize
size_t size
- object size
Description
Initialize an already allocated GEM object of the specified size with shmfs backing store.
void drm_gem_private_object_init(struct drm_device * dev, struct drm_gem_object * obj, size_t size)
initialize an allocated private GEM object
Parameters
struct drm_device * dev
- drm_device the object should be initialized for
struct drm_gem_object * obj
- drm_gem_object to initialize
size_t size
- object size
Description
Initialize an already allocated GEM object of the specified size with no GEM provided backing store. Instead the caller is responsible for backing the object and handling it.
void drm_gem_private_object_fini(struct drm_gem_object * obj)
finalize a failed drm_gem_object
Parameters
struct drm_gem_object * obj
- drm_gem_object
Description
Uninitialize an already allocated GEM object when its initialization failed.
int drm_gem_handle_delete(struct drm_file * filp, u32 handle)
deletes the given file-private handle
Parameters
struct drm_file * filp
- drm file-private structure to use for the handle look up
u32 handle
- userspace handle to delete
Description
Removes the GEM handle from the filp lookup table which has been added
with drm_gem_handle_create(). If this is the last handle, it also cleans
up linked resources like GEM names.
int drm_gem_dumb_map_offset(struct drm_file * file, struct drm_device * dev, u32 handle, u64 * offset)
return the fake mmap offset for a gem object
Parameters
struct drm_file * file
- drm file-private structure containing the gem object
struct drm_device * dev
- corresponding drm_device
u32 handle
- gem object handle
u64 * offset
- return location for the fake mmap offset
Description
This implements the drm_driver.dumb_map_offset kms driver callback for
drivers which use gem to manage their backing storage.
Return
0 on success or a negative error code on failure.
int drm_gem_handle_create(struct drm_file * file_priv, struct drm_gem_object * obj, u32 * handlep)
create a gem handle for an object
Parameters
struct drm_file * file_priv
- drm file-private structure to register the handle for
struct drm_gem_object * obj
- object to register
u32 * handlep
- pointer to return the created handle to the caller
Description
Create a handle for this object. This adds a handle reference to the object, which includes a regular reference count. Callers will likely want to drop their own reference to the object afterwards.
Since this publishes obj to userspace, it must be fully set up by this point; drivers must call this last in their buffer object creation callbacks.
void drm_gem_free_mmap_offset(struct drm_gem_object * obj)
release a fake mmap offset for an object
Parameters
struct drm_gem_object * obj
- obj in question
Description
This routine frees fake offsets allocated by drm_gem_create_mmap_offset().
Note that drm_gem_object_release() already calls this function, so
drivers don’t have to take care of releasing the mmap offset themselves
when freeing the GEM object.
int drm_gem_create_mmap_offset_size(struct drm_gem_object * obj, size_t size)
create a fake mmap offset for an object
Parameters
struct drm_gem_object * obj
- obj in question
size_t size
- the virtual size
Description
GEM memory mapping works by handing back to userspace a fake mmap offset it can use in a subsequent mmap(2) call. The DRM core code then looks up the object based on the offset and sets up the various memory mapping structures.
This routine allocates and attaches a fake offset for obj, in cases
where the virtual size differs from the physical size (i.e.
drm_gem_object.size). Otherwise just use drm_gem_create_mmap_offset().
This function is idempotent and handles an already allocated mmap offset transparently. Drivers do not need to check for this case.
int drm_gem_create_mmap_offset(struct drm_gem_object * obj)
create a fake mmap offset for an object
Parameters
struct drm_gem_object * obj
- obj in question
Description
GEM memory mapping works by handing back to userspace a fake mmap offset it can use in a subsequent mmap(2) call. The DRM core code then looks up the object based on the offset and sets up the various memory mapping structures.
This routine allocates and attaches a fake offset for obj.
Drivers can call drm_gem_free_mmap_offset() before freeing obj to
release the fake offset again.
struct page ** drm_gem_get_pages(struct drm_gem_object * obj)
helper to allocate backing pages for a GEM object from shmem
Parameters
struct drm_gem_object * obj
- obj in question
Description
This reads the page-array of the shmem-backing storage of the given gem object. An array of pages is returned. If a page is not allocated or swapped-out, this will allocate/swap-in the required pages. Note that the whole object is covered by the page-array and pinned in memory.
Use drm_gem_put_pages() to release the array and unpin all pages.
This uses the GFP-mask set on the shmem-mapping (see
mapping_set_gfp_mask()). If you require other GFP-masks, you have to do
those allocations yourself.
Note that you are not allowed to change gfp-zones during runtime. That
is, shmem_read_mapping_page_gfp() must be called with the same
gfp_zone(gfp) as set during initialization. If you have special zone
constraints, set them after drm_gem_object_init() via
mapping_set_gfp_mask(). shmem-core takes care to keep pages in the
required zone during swap-in.
This function is only valid on objects initialized with
drm_gem_object_init(), but not for those initialized with
drm_gem_private_object_init() only.
void drm_gem_put_pages(struct drm_gem_object * obj, struct page ** pages, bool dirty, bool accessed)
helper to free backing pages for a GEM object
Parameters
struct drm_gem_object * obj
- obj in question
struct page ** pages
- pages to free
bool dirty
- if true, pages will be marked as dirty
bool accessed
- if true, the pages will be marked as accessed
int drm_gem_objects_lookup(struct drm_file * filp, void __user * bo_handles, int count, struct drm_gem_object *** objs_out)
look up GEM objects from an array of handles
Parameters
struct drm_file * filp
- DRM file private data
void __user * bo_handles
- user pointer to array of userspace handles
int count
- size of handle array
struct drm_gem_object *** objs_out
- returned pointer to array of drm_gem_object pointers
Description
Takes an array of userspace handles and returns a newly allocated array of GEM objects.
For a single handle lookup, use drm_gem_object_lookup().
Return
objs_out is filled in with GEM object pointers. Returned GEM objects
need to be released with drm_gem_object_put(). -ENOENT is returned on a
lookup failure. 0 is returned on success.
struct drm_gem_object * drm_gem_object_lookup(struct drm_file * filp, u32 handle)
look up a GEM object from its handle
Parameters
struct drm_file * filp
- DRM file private data
u32 handle
- userspace handle
Return
A reference to the object named by the handle if such exists on filp, NULL otherwise.
If looking up an array of handles, use drm_gem_objects_lookup().
long drm_gem_dma_resv_wait(struct drm_file * filep, u32 handle, bool wait_all, unsigned long timeout)
wait on the shared and/or exclusive fences of a GEM object’s reservation object
Parameters
struct drm_file * filep
- DRM file private data
u32 handle
- userspace handle
bool wait_all
- if true, wait on all fences, else wait on just exclusive fence
unsigned long timeout
- timeout value in jiffies or zero to return immediately
Return
Returns -ERESTARTSYS if interrupted, 0 if the wait timed out, or greater than 0 on success.
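The three documented outcomes can be told apart with a small classifier. This is a userspace sketch: MODEL_ERESTARTSYS is defined locally because ERESTARTSYS is a kernel-internal code that userspace errno.h does not provide (the value 512 is an assumption about the kernel's definition), and treating any other negative value as a generic error is also an assumption, since the text above lists only three cases.

```c
#include <assert.h>

/* Local stand-in for the kernel-internal ERESTARTSYS code (assumed value). */
#define MODEL_ERESTARTSYS 512

enum wait_outcome {
    WAIT_SIGNALED,      /* fences signaled before the timeout expired */
    WAIT_TIMED_OUT,     /* timeout elapsed first */
    WAIT_INTERRUPTED,   /* wait was interrupted */
    WAIT_ERROR          /* any other negative value (assumption) */
};

/* Map the long return value of a drm_gem_dma_resv_wait()-style call
 * onto one of the outcomes above. */
static enum wait_outcome classify_wait(long ret)
{
    if (ret > 0)
        return WAIT_SIGNALED;
    if (ret == 0)
        return WAIT_TIMED_OUT;
    if (ret == -MODEL_ERESTARTSYS)
        return WAIT_INTERRUPTED;
    return WAIT_ERROR;
}
```

The point worth noticing is that 0 signals a timeout rather than success, the opposite of the usual kernel convention for int-returning functions.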
void drm_gem_object_release(struct drm_gem_object * obj)
release GEM buffer object resources
Parameters
struct drm_gem_object * obj
- GEM buffer object
Description
This releases any structures and resources used by obj and is the
inverse of drm_gem_object_init().
void drm_gem_object_free(struct kref * kref)
free a GEM object
Parameters
struct kref * kref
- kref of the object to free
Description
Called after the last reference to the object has been lost.
Frees the object.
-
void
drm_gem_vm_open
(struct vm_area_struct * vma)¶ vma->ops->open implementation for GEM
Parameters
struct vm_area_struct * vma
- VM area structure
Description
This function implements the vm_operations_struct open() callback for GEM drivers. This must be used together with drm_gem_vm_close().
-
void
drm_gem_vm_close
(struct vm_area_struct * vma)¶ vma->ops->close implementation for GEM
Parameters
struct vm_area_struct * vma
- VM area structure
Description
This function implements the vm_operations_struct close() callback for GEM drivers. This must be used together with drm_gem_vm_open().
-
int
drm_gem_mmap_obj
(struct drm_gem_object * obj, unsigned long obj_size, struct vm_area_struct * vma)¶ memory map a GEM object
Parameters
struct drm_gem_object * obj
- the GEM object to map
unsigned long obj_size
- the object size to be mapped, in bytes
struct vm_area_struct * vma
- VMA for the area to be mapped
Description
Set up the VMA to prepare mapping of the GEM object using the GEM object’s vm_ops. Depending on their requirements, GEM objects can either provide a fault handler in their vm_ops (in which case any accesses to the object will be trapped, to perform migration, GTT binding, surface register allocation, or performance monitoring), or mmap the buffer memory synchronously after calling drm_gem_mmap_obj.
This function is mainly intended to implement the DMABUF mmap operation, when
the GEM object is not looked up based on its fake offset. To implement the
DRM mmap operation, drivers should use the drm_gem_mmap()
function.
drm_gem_mmap_obj()
assumes the user is granted access to the buffer while
drm_gem_mmap()
prevents unprivileged users from mapping random objects. So
callers must verify access restrictions before calling this helper.
Returns 0 on success, or -EINVAL if the object size is smaller than the VMA size or if no vm_ops are provided.
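The size condition behind the -EINVAL return can be illustrated with a small userspace model. This is only a sketch of the documented rule, not the kernel implementation: struct vma_range and check_mmap_size() are hypothetical names, and the vm_ops check is left out.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the two VMA fields the size check needs. */
struct vma_range {
    unsigned long vm_start; /* first byte of the mapping */
    unsigned long vm_end;   /* one past the last byte of the mapping */
};

/* Returns 0 if the mapping fits inside the object, -EINVAL otherwise. */
static int check_mmap_size(unsigned long obj_size, const struct vma_range *vma)
{
    assert(vma->vm_end >= vma->vm_start);

    /* The object must be at least as large as the requested mapping. */
    if (obj_size < vma->vm_end - vma->vm_start)
        return -EINVAL;
    return 0;
}
```

A mapping that is smaller than the object is accepted; only a mapping that would extend past the end of the object is rejected.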
-
int
drm_gem_mmap
(struct file * filp, struct vm_area_struct * vma)¶ memory map routine for GEM objects
Parameters
struct file * filp
- DRM file pointer
struct vm_area_struct * vma
- VMA for the area to be mapped
Description
If a driver supports GEM object mapping, mmap calls on the DRM file descriptor will end up here.
Look up the GEM object based on the offset passed in (vma->vm_pgoff will
contain the fake offset we created when the GTT map ioctl was called on
the object) and map it with a call to drm_gem_mmap_obj()
.
If the caller is not granted access to the buffer object, the mmap will fail with EACCES. Please see the vma manager for more information.
-
int
drm_gem_lock_reservations
(struct drm_gem_object ** objs, int count, struct ww_acquire_ctx * acquire_ctx)¶ Sets up the ww context and acquires the lock on an array of GEM objects.
Parameters
struct drm_gem_object ** objs
- drm_gem_objects to lock
int count
- Number of objects in objs
struct ww_acquire_ctx * acquire_ctx
- struct ww_acquire_ctx that will be initialized as part of tracking this set of locked reservations.
Description
Once you’ve locked your reservations, you’ll want to set up space
for your shared fences (if applicable), submit your job, then
drm_gem_unlock_reservations()
.
-
void
drm_gem_lru_init
(struct drm_gem_lru * lru, struct mutex * lock)¶ initialize a LRU
Parameters
struct drm_gem_lru * lru
- The LRU to initialize
struct mutex * lock
- The lock protecting the LRU
-
void
drm_gem_lru_remove
(struct drm_gem_object * obj)¶ remove object from whatever LRU it is in
Parameters
struct drm_gem_object * obj
- The GEM object to remove from current LRU
Description
If the object is currently in any LRU, remove it.
-
void
drm_gem_lru_move_tail
(struct drm_gem_lru * lru, struct drm_gem_object * obj)¶ move the object to the tail of the LRU
Parameters
struct drm_gem_lru * lru
- The LRU to move the object into.
struct drm_gem_object * obj
- The GEM object to move into this LRU
Description
If the object is already in this LRU it will be moved to the tail. Otherwise it will be removed from whichever other LRU it is in (if any) and moved into this LRU.
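The move-to-tail behaviour described above can be modelled in userspace with a plain doubly linked list. This is a sketch under stated assumptions: struct lru, struct obj, lru_remove() and lru_move_tail() are hypothetical names, no locking is shown, and the count field tracks objects rather than whatever the kernel structure accounts for.

```c
#include <assert.h>
#include <stddef.h>

struct lru;

struct obj {
    struct obj *prev, *next; /* neighbours within the current LRU */
    struct lru *lru;         /* LRU the object is on, or NULL */
};

struct lru {
    struct obj *head, *tail; /* head is reclaimed first, tail last */
    long count;              /* number of objects (sketch-only accounting) */
};

/* Remove the object from whatever LRU it is on; no-op if it is on none. */
static void lru_remove(struct obj *o)
{
    struct lru *l = o->lru;

    if (!l)
        return;
    if (o->prev) o->prev->next = o->next; else l->head = o->next;
    if (o->next) o->next->prev = o->prev; else l->tail = o->prev;
    o->prev = o->next = NULL;
    o->lru = NULL;
    l->count--;
}

/* Move the object to the tail of l, leaving any other LRU first. */
static void lru_move_tail(struct lru *l, struct obj *o)
{
    assert(l && o);
    lru_remove(o);
    o->prev = l->tail;
    o->next = NULL;
    if (l->tail) l->tail->next = o; else l->head = o;
    l->tail = o;
    o->lru = l;
    l->count++;
}
```

Because the object records which list it is on, moving it between lists never needs the caller to name the source list.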
-
unsigned long
drm_gem_lru_scan
(struct drm_gem_lru * lru, unsigned int nr_to_scan, unsigned long * remaining, bool (*shrink)(struct drm_gem_object *obj))¶ helper to implement shrinker.scan_objects
Parameters
struct drm_gem_lru * lru
- The LRU to scan
unsigned int nr_to_scan
- The number of pages to try to reclaim
unsigned long * remaining
- The number of pages left to reclaim, should be initialized by caller
bool (*)(struct drm_gem_object *obj) shrink
- Callback to try to shrink/reclaim the object.
Description
If the shrink callback succeeds, it is expected that the driver move the object out of this LRU.
If the LRU possibly contains active buffers, it is the responsibility of the shrink callback to check for this (i.e. dma_resv_test_signaled()) or, if necessary, block until the buffer becomes idle.
GEM CMA Helper Functions Reference¶
Error
kernel-doc missing
VMA Offset Manager¶
The vma-manager is responsible for mapping arbitrary driver-dependent memory regions into the linear user address-space. It provides offsets to the caller which can then be used on the address_space of the drm-device. It takes care to not overlap regions, size them appropriately and to not confuse mm-core by inconsistent fake vm_pgoff fields. Drivers shouldn’t use this for object placement in VMEM. This manager should only be used to manage mappings into linear user-space VMs.
We use drm_mm as backend to manage object allocations. But it is highly optimized for alloc/free calls, not lookups. Hence, we use an rb-tree to speed up offset lookups.
You must not use multiple offset managers on a single address_space. Otherwise, mm-core will be unable to tear down memory mappings as the VM will no longer be linear.
This offset manager works on page-based addresses. That is, every argument and return code (with the exception of drm_vma_node_offset_addr()) is given in number of pages, not number of bytes. That means, object sizes and offsets must always be page-aligned (as usual).
If you want to get a valid byte-based user-space address for a given offset, please see drm_vma_node_offset_addr().
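The relation between the page-based values used by the manager and the byte-based offsets handed to user-space is a plain shift by the page size. The sketch below assumes 4 KiB pages; SKETCH_PAGE_SHIFT, pages_to_bytes() and bytes_to_pages() are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/* Convert a page-based start or size into a byte-based one. */
static uint64_t pages_to_bytes(unsigned long pages)
{
    /* Widen first so large page numbers do not overflow a 32-bit long. */
    return (uint64_t)pages << SKETCH_PAGE_SHIFT;
}

/* Convert a byte-based value back to pages; it must be page-aligned. */
static unsigned long bytes_to_pages(uint64_t bytes)
{
    assert((bytes & ((UINT64_C(1) << SKETCH_PAGE_SHIFT) - 1)) == 0);
    return (unsigned long)(bytes >> SKETCH_PAGE_SHIFT);
}
```

The widening cast matters on 32-bit systems, where a page number can be valid while the corresponding byte offset no longer fits in an unsigned long.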
In addition to offset management, the vma offset manager also handles access management. For every open-file context that is allowed to access a given node, you must call drm_vma_node_allow(). Otherwise, an mmap() call on this open-file with the offset of the node will fail with -EACCES. To revoke access again, use drm_vma_node_revoke(). However, the caller is responsible for destroying already existing mappings, if required.
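The access bookkeeping can be pictured as a per-node list of tags with a reference count each. The following userspace model is a sketch only: the fixed-size table, struct access_list and the node_allow(), node_revoke() and node_is_allowed() functions are hypothetical, and the internal locking of the real manager is omitted. The ref_counted flag distinguishes the behaviour documented for drm_vma_node_allow() from that of drm_vma_node_allow_once().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_TAGS 8 /* fixed table keeps the sketch allocation-free */

struct access_entry {
    const void *tag;     /* open-file identity, NULL marks a free slot */
    unsigned long count; /* how many times the tag was allowed */
};

struct access_list {
    struct access_entry e[MAX_TAGS];
};

static struct access_entry *find_entry(struct access_list *l, const void *tag)
{
    size_t i;

    for (i = 0; i < MAX_TAGS; i++)
        if (l->e[i].tag == tag)
            return &l->e[i];
    return NULL;
}

/* Grant access; repeated calls only stack up when ref_counted is true. */
static int node_allow(struct access_list *l, const void *tag, bool ref_counted)
{
    struct access_entry *e;

    assert(tag != NULL);
    e = find_entry(l, tag);
    if (e) {
        if (ref_counted)
            e->count++;
        return 0;
    }
    e = find_entry(l, NULL); /* look for a free slot */
    if (!e)
        return -ENOMEM;
    e->tag = tag;
    e->count = 1;
    return 0;
}

/* Drop one reference; the tag leaves the list when the count hits zero. */
static void node_revoke(struct access_list *l, const void *tag)
{
    struct access_entry *e = tag ? find_entry(l, tag) : NULL;

    if (e && --e->count == 0)
        e->tag = NULL;
}

static bool node_is_allowed(struct access_list *l, const void *tag)
{
    return tag != NULL && find_entry(l, tag) != NULL;
}
```

With ref-counted grants, access survives until every grant has been revoked; with the once variant, a single revoke is enough regardless of how often access was granted.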
-
struct drm_vma_offset_node *
drm_vma_offset_exact_lookup_locked
(struct drm_vma_offset_manager * mgr, unsigned long start, unsigned long pages)¶ Look up node by exact address
Parameters
struct drm_vma_offset_manager * mgr
- Manager object
unsigned long start
- Start address (page-based, not byte-based)
unsigned long pages
- Size of object (page-based)
Description
Same as drm_vma_offset_lookup_locked()
but does not allow any offset into the node.
It only returns the exact object with the given start address.
Return
Node at exact start address start.
-
void
drm_vma_offset_lock_lookup
(struct drm_vma_offset_manager * mgr)¶ Lock lookup for extended private use
Parameters
struct drm_vma_offset_manager * mgr
- Manager object
Description
Lock VMA manager for extended lookups. Only locked VMA function calls
are allowed while holding this lock. All other contexts are blocked from VMA
until the lock is released via drm_vma_offset_unlock_lookup()
.
Use this if you need to take a reference to the objects returned by
drm_vma_offset_lookup_locked()
before releasing this lock again.
This lock must not be used for anything else than extended lookups. You must not call any other VMA helpers while holding this lock.
Note
You’re in atomic-context while holding this lock!
-
void
drm_vma_offset_unlock_lookup
(struct drm_vma_offset_manager * mgr)¶ Unlock lookup for extended private use
Parameters
struct drm_vma_offset_manager * mgr
- Manager object
Description
Release lookup-lock. See drm_vma_offset_lock_lookup()
for more information.
-
void
drm_vma_node_reset
(struct drm_vma_offset_node * node)¶ Initialize or reset node object
Parameters
struct drm_vma_offset_node * node
- Node to initialize or reset
Description
Reset a node to its initial state. This must be called before using it with any VMA offset manager.
This must not be called on an already allocated node, or you will leak memory.
-
unsigned long
drm_vma_node_start
(const struct drm_vma_offset_node * node)¶ Return start address for page-based addressing
Parameters
const struct drm_vma_offset_node * node
- Node to inspect
Description
Return the start address of the given node. This can be used as offset into
the linear VM space that is provided by the VMA offset manager. Note that
this can only be used for page-based addressing. If you need a proper offset
for user-space mappings, you must apply “<< PAGE_SHIFT” or use the
drm_vma_node_offset_addr()
helper instead.
Return
Start address of node for page-based addressing. 0 if the node does not have an offset allocated.
-
unsigned long
drm_vma_node_size
(struct drm_vma_offset_node * node)¶ Return size (page-based)
Parameters
struct drm_vma_offset_node * node
- Node to inspect
Description
Return the size as number of pages for the given node. This is the same size
that was passed to drm_vma_offset_add()
. If no offset is allocated for the
node, this is 0.
Return
Size of node as number of pages. 0 if the node does not have an offset allocated.
-
__u64
drm_vma_node_offset_addr
(struct drm_vma_offset_node * node)¶ Return sanitized offset for user-space mmaps
Parameters
struct drm_vma_offset_node * node
- Linked offset node
Description
Same as drm_vma_node_start()
but returns the address as a valid offset that
can be used for user-space mappings during mmap()
.
This must not be called on unlinked nodes.
Return
Offset of node for byte-based addressing. 0 if the node does not have an object allocated.
-
void
drm_vma_node_unmap
(struct drm_vma_offset_node * node, struct address_space * file_mapping)¶ Unmap offset node
Parameters
struct drm_vma_offset_node * node
- Offset node
struct address_space * file_mapping
- Address space to unmap node from
Description
Unmap all userspace mappings for a given offset node. The mappings must be associated with the file_mapping address-space. If no offset exists nothing is done.
This call is unlocked. The caller must guarantee that drm_vma_offset_remove()
is not called on this node concurrently.
-
int
drm_vma_node_verify_access
(struct drm_vma_offset_node * node, struct drm_file * tag)¶ Access verification helper for TTM
Parameters
struct drm_vma_offset_node * node
- Offset node
struct drm_file * tag
- Tag of file to check
Description
This checks whether tag is granted access to node. It is the same as
drm_vma_node_is_allowed()
but suitable as drop-in helper for TTM
verify_access()
callbacks.
Return
0 if access is granted, -EACCES otherwise.
-
void
drm_vma_offset_manager_init
(struct drm_vma_offset_manager * mgr, unsigned long page_offset, unsigned long size)¶ Initialize new offset-manager
Parameters
struct drm_vma_offset_manager * mgr
- Manager object
unsigned long page_offset
- Offset of available memory area (page-based)
unsigned long size
- Size of available address space range (page-based)
Description
Initialize a new offset-manager. The offset and area size available for the manager are given as page_offset and size. Both are interpreted as page-numbers, not bytes.
Adding/removing nodes from the manager is locked internally and protected against concurrent access. However, node allocation and destruction is left for the caller. While calling into the vma-manager, a given node must always be guaranteed to be referenced.
-
void
drm_vma_offset_manager_destroy
(struct drm_vma_offset_manager * mgr)¶ Destroy offset manager
Parameters
struct drm_vma_offset_manager * mgr
- Manager object
Description
Destroy an object manager which was previously created via
drm_vma_offset_manager_init()
. The caller must remove all allocated nodes
before destroying the manager. Otherwise, drm_mm will refuse to free the
requested resources.
The manager must not be accessed after this function is called.
-
struct drm_vma_offset_node *
drm_vma_offset_lookup_locked
(struct drm_vma_offset_manager * mgr, unsigned long start, unsigned long pages)¶ Find node in offset space
Parameters
struct drm_vma_offset_manager * mgr
- Manager object
unsigned long start
- Start address for object (page-based)
unsigned long pages
- Size of object (page-based)
Description
Find a node given a start address and object size. This returns the _best_ match for the given node. That is, start may point somewhere into a valid region and the given node will be returned, as long as the node spans the whole requested area (given the size in number of pages as pages).
Note that before lookup the vma offset manager lookup lock must be acquired
with drm_vma_offset_lock_lookup()
. See there for an example. This can then be
used to implement weakly referenced lookups using kref_get_unless_zero()
.
Example
drm_vma_offset_lock_lookup(mgr);
node = drm_vma_offset_lookup_locked(mgr, start, pages);
if (node)
    kref_get_unless_zero(container_of(node, sth, entr));
drm_vma_offset_unlock_lookup(mgr);
Return
Returns NULL if no suitable node can be found. Otherwise, the best match is returned. It’s the caller’s responsibility to make sure the node doesn’t get destroyed before the caller can access it.
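The difference between the best-match lookup and the exact lookup comes down to two range conditions. The userspace model below is a sketch under stated assumptions: it uses a linear scan over an array instead of the rb-tree, assumes the nodes do not overlap, and the names struct node, lookup_best() and lookup_exact() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node: a range [start, start + size) in pages. */
struct node {
    unsigned long start;
    unsigned long size;
};

/* Best match: the node must span the whole requested area. */
static const struct node *lookup_best(const struct node *nodes, size_t n,
                                      unsigned long start, unsigned long pages)
{
    size_t i;

    assert(nodes != NULL || n == 0);
    for (i = 0; i < n; i++) {
        unsigned long off;

        if (nodes[i].start > start)
            continue;
        off = start - nodes[i].start; /* offset into the node */
        if (off <= nodes[i].size && pages <= nodes[i].size - off)
            return &nodes[i];
    }
    return NULL;
}

/* Exact match: same as above, but no offset into the node is allowed. */
static const struct node *lookup_exact(const struct node *nodes, size_t n,
                                       unsigned long start, unsigned long pages)
{
    const struct node *best = lookup_best(nodes, n, start, pages);

    return (best && best->start == start) ? best : NULL;
}
```

A request that starts inside a node but runs past its end matches neither lookup, which is what lets a caller reject mappings larger than the size registered for the node.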
-
int
drm_vma_offset_add
(struct drm_vma_offset_manager * mgr, struct drm_vma_offset_node * node, unsigned long pages)¶ Add offset node to manager
Parameters
struct drm_vma_offset_manager * mgr
- Manager object
struct drm_vma_offset_node * node
- Node to be added
unsigned long pages
- Allocation size visible to user-space (in number of pages)
Description
Add a node to the offset-manager. If the node was already added, this does nothing and returns 0. pages is the size of the object given in number of pages. After this call succeeds, you can access the offset of the node until it is removed again.
If this call fails, it is safe to retry the operation or to call drm_vma_offset_remove() anyway; no cleanup is required in that case.
pages is not required to be the same size as the underlying memory object that you want to map. It only limits the size that user-space can map into their address space.
Return
0 on success, negative error code on failure.
-
void
drm_vma_offset_remove
(struct drm_vma_offset_manager * mgr, struct drm_vma_offset_node * node)¶ Remove offset node from manager
Parameters
struct drm_vma_offset_manager * mgr
- Manager object
struct drm_vma_offset_node * node
- Node to be removed
Description
Remove a node from the offset manager. If the node wasn’t added before, this
does nothing. After this call returns, the offset and size will be 0 until a
new offset is allocated via drm_vma_offset_add()
again. Helper functions like
drm_vma_node_start()
and drm_vma_node_offset_addr()
will return 0 if no
offset is allocated.
-
int
drm_vma_node_allow
(struct drm_vma_offset_node * node, struct drm_file * tag)¶ Add open-file to list of allowed users
Parameters
struct drm_vma_offset_node * node
- Node to modify
struct drm_file * tag
- Tag of file to allow
Description
Add tag to the list of allowed open-files for this node. If tag is already on this list, the ref-count is incremented.
The list of allowed-users is preserved across drm_vma_offset_add()
and
drm_vma_offset_remove()
calls. You may even call it if the node is currently
not added to any offset-manager.
You must remove all open-files the same number of times as you added them before destroying the node. Otherwise, you will leak memory.
This is locked against concurrent access internally.
Return
0 on success, negative error code on internal failure (out-of-mem)
-
int
drm_vma_node_allow_once
(struct drm_vma_offset_node * node, struct drm_file * tag)¶ Add open-file to list of allowed users
Parameters
struct drm_vma_offset_node * node
- Node to modify
struct drm_file * tag
- Tag of file to allow
Description
Add tag to the list of allowed open-files for this node.
The list of allowed-users is preserved across drm_vma_offset_add()
and
drm_vma_offset_remove()
calls. You may even call it if the node is currently
not added to any offset-manager.
This is not ref-counted unlike drm_vma_node_allow()
hence drm_vma_node_revoke()
should only be called once after this.
This is locked against concurrent access internally.
Return
0 on success, negative error code on internal failure (out-of-mem)
-
void
drm_vma_node_revoke
(struct drm_vma_offset_node * node, struct drm_file * tag)¶ Remove open-file from list of allowed users
Parameters
struct drm_vma_offset_node * node
- Node to modify
struct drm_file * tag
- Tag of file to remove
Description
Decrement the ref-count of tag in the list of allowed open-files on node.
If the ref-count drops to zero, remove tag from the list. You must call
this once for every drm_vma_node_allow()
on tag.
This is locked against concurrent access internally.
If tag is not on the list, nothing is done.
-
bool
drm_vma_node_is_allowed
(struct drm_vma_offset_node * node, struct drm_file * tag)¶ Check whether an open-file is granted access
Parameters
struct drm_vma_offset_node * node
- Node to check
struct drm_file * tag
- Tag of file to check
Description
Search the list in node whether tag is currently on the list of allowed
open-files (see drm_vma_node_allow()
).
This is locked against concurrent access internally.
Return
true if tag is on the list
PRIME Buffer Sharing¶
PRIME is the cross-device buffer sharing framework in DRM, originally created for the OPTIMUS range of multi-GPU platforms. To userspace, PRIME buffers are dma-buf based file descriptors.
Overview and Driver Interface¶
Similar to GEM global names, PRIME file descriptors are also used to share buffer objects across processes. They offer additional security: as file descriptors must be explicitly sent over UNIX domain sockets to be shared between applications, they can’t be guessed like the globally unique GEM names.
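Passing a file descriptor over a UNIX domain socket uses SCM_RIGHTS ancillary data. The userspace sketch below shows the mechanism in general terms; it is not PRIME-specific, the helper names send_fd() and recv_fd() are hypothetical, and any descriptor (a pipe end in a test, a dma-buf descriptor in practice) can be passed the same way.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Send one descriptor over a connected UNIX socket. Returns 0 or -1. */
static int send_fd(int sock, int fd)
{
    char dummy = 'F'; /* at least one byte of real data must be sent */
    struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align; /* forces correct alignment of buf */
    } u;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    assert(fd >= 0);
    memset(&u, 0, sizeof(u));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one descriptor. Returns the new descriptor or -1. */
static int recv_fd(int sock)
{
    char dummy;
    struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } u;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    int fd;

    memset(&u, 0, sizeof(u));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    if (recvmsg(sock, &msg, 0) != 1)
        return -1;
    cmsg = CMSG_FIRSTHDR(&msg);
    if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```

The receiver gets a new descriptor number that refers to the same underlying open file, which is why the value cannot simply be guessed or copied as an integer between processes.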
Drivers that support the PRIME API must set the DRIVER_PRIME bit in the struct drm_driver driver_features field, and implement the prime_handle_to_fd and prime_fd_to_handle operations.
int (*prime_handle_to_fd)(struct drm_device *dev, struct drm_file *file_priv, uint32_t handle, uint32_t flags, int *prime_fd);
int (*prime_fd_to_handle)(struct drm_device *dev, struct drm_file *file_priv, int prime_fd, uint32_t *handle);
Those two operations convert a handle to a PRIME file descriptor and vice versa. Drivers must use the kernel dma-buf buffer sharing framework to manage the PRIME file descriptors. Similar to the mode setting API PRIME is agnostic to the underlying buffer object manager, as long as handles are 32bit unsigned integers.
While non-GEM drivers must implement the operations themselves, GEM drivers must use the drm_gem_prime_handle_to_fd() and drm_gem_prime_fd_to_handle() helper functions. Those helpers rely on the driver gem_prime_export and gem_prime_import operations to create a dma-buf instance from a GEM object (dma-buf exporter role) and to create a GEM object from a dma-buf instance (dma-buf importer role).
struct dma_buf * (*gem_prime_export)(struct drm_device *dev, struct drm_gem_object *obj, int flags);
struct drm_gem_object * (*gem_prime_import)(struct drm_device *dev, struct dma_buf *dma_buf);
These two operations are mandatory for GEM drivers that support PRIME.
PRIME Helper Functions¶
Drivers can implement drm_gem_object_funcs.export
and
drm_driver.gem_prime_import
in terms of simpler APIs by using the helper
functions drm_gem_prime_export()
and drm_gem_prime_import()
. These functions
implement dma-buf support in terms of some lower-level helpers, which are
again exported for drivers to use individually:
Exporting buffers¶
Optional pinning of buffers is handled at dma-buf attach and detach time in
drm_gem_map_attach()
and drm_gem_map_detach()
. Backing storage itself is
handled by drm_gem_map_dma_buf()
and drm_gem_unmap_dma_buf()
, which relies on
drm_gem_object_funcs.get_sg_table
.
For kernel-internal access there’s drm_gem_dmabuf_vmap()
and
drm_gem_dmabuf_vunmap()
. Userspace mmap support is provided by
drm_gem_dmabuf_mmap()
.
Note that these export helpers can only be used if the underlying backing storage is fully coherent and either permanently pinned, or it is safe to pin it indefinitely.
FIXME: The underlying helper functions are named rather inconsistently.
Importing buffers¶
Importing dma-bufs using drm_gem_prime_import()
relies on
drm_driver.gem_prime_import_sg_table
.
Note that, similarly to the export helpers, this permanently pins the underlying backing storage. This is ok for scanout, but is not the best option for sharing lots of buffers for rendering.
PRIME Function References¶
-
struct
drm_prime_file_private
¶ per-file tracking for PRIME
Definition
struct drm_prime_file_private {
};
Members
Description
This just contains the internal struct dma_buf
and handle caches for each
struct drm_file
used by the PRIME core code.
-
struct dma_buf *
drm_gem_dmabuf_export
(struct drm_device * dev, struct dma_buf_export_info * exp_info)¶ dma_buf export implementation for GEM
Parameters
struct drm_device * dev
- parent device for the exported dmabuf
struct dma_buf_export_info * exp_info
- the export information used by
dma_buf_export()
Description
This wraps dma_buf_export()
for use by generic GEM drivers that are using
drm_gem_dmabuf_release()
. In addition to calling dma_buf_export()
, we take
a reference to the drm_device
and the exported drm_gem_object
(stored in
dma_buf_export_info.priv
) which is released by drm_gem_dmabuf_release()
.
Returns the new dmabuf.
-
void
drm_gem_dmabuf_release
(struct dma_buf * dma_buf)¶ dma_buf release implementation for GEM
Parameters
struct dma_buf * dma_buf
- buffer to be released
Description
Generic release function for dma_bufs exported as PRIME buffers. GEM drivers
must use this in their dma_buf_ops
structure as the release callback.
drm_gem_dmabuf_release()
should be used in conjunction with
drm_gem_dmabuf_export()
.
-
int
drm_gem_prime_fd_to_handle
(struct drm_device * dev, struct drm_file * file_priv, int prime_fd, uint32_t * handle)¶ PRIME import function for GEM drivers
Parameters
struct drm_device * dev
- dev to import the buffer into
struct drm_file * file_priv
- drm file-private structure
int prime_fd
- fd id of the dma-buf which should be imported
uint32_t * handle
- pointer to storage for the handle of the imported buffer object
Description
This is the PRIME import function. GEM drivers must use it to ensure correct lifetime management of the underlying GEM object.
The actual importing of GEM object from the dma-buf is done through the
drm_driver.gem_prime_import
driver callback.
Returns 0 on success or a negative error code on failure.
-
int
drm_gem_prime_handle_to_fd
(struct drm_device * dev, struct drm_file * file_priv, uint32_t handle, uint32_t flags, int * prime_fd)¶ PRIME export function for GEM drivers
Parameters
struct drm_device * dev
- dev to export the buffer from
struct drm_file * file_priv
- drm file-private structure
uint32_t handle
- buffer handle to export
uint32_t flags
- flags like DRM_CLOEXEC
int * prime_fd
- pointer to storage for the fd id of the created dma-buf
Description
This is the PRIME export function. GEM drivers must use it to ensure correct lifetime management of the underlying GEM object.
The actual exporting from GEM object to a dma-buf is done through the
drm_gem_object_funcs.export
callback.
-
int
drm_gem_map_attach
(struct dma_buf * dma_buf, struct dma_buf_attachment * attach)¶ dma_buf attach implementation for GEM
Parameters
struct dma_buf * dma_buf
- buffer to attach device to
struct dma_buf_attachment * attach
- buffer attachment data
Description
Calls drm_gem_object_funcs.pin
for device specific handling. This can be
used as the dma_buf_ops.attach
callback. Must be used together with
drm_gem_map_detach()
.
Returns 0 on success, negative error code on failure.
-
void
drm_gem_map_detach
(struct dma_buf * dma_buf, struct dma_buf_attachment * attach)¶ dma_buf detach implementation for GEM
Parameters
struct dma_buf * dma_buf
- buffer to detach from
struct dma_buf_attachment * attach
- attachment to be detached
Description
Calls drm_gem_object_funcs.unpin
for device specific handling. Cleans up
dma_buf_attachment
from drm_gem_map_attach()
. This can be used as the
dma_buf_ops.detach
callback.
-
struct sg_table *
drm_gem_map_dma_buf
(struct dma_buf_attachment * attach, enum dma_data_direction dir)¶ map_dma_buf implementation for GEM
Parameters
struct dma_buf_attachment * attach
- attachment whose scatterlist is to be returned
enum dma_data_direction dir
- direction of DMA transfer
Description
Calls drm_gem_object_funcs.get_sg_table
and then maps the scatterlist. This
can be used as the dma_buf_ops.map_dma_buf
callback. Should be used together
with drm_gem_unmap_dma_buf()
.
Return
sg_table containing the scatterlist to be returned; returns ERR_PTR on error. May return -EINTR if it is interrupted by a signal.
-
void
drm_gem_unmap_dma_buf
(struct dma_buf_attachment * attach, struct sg_table * sgt, enum dma_data_direction dir)¶ unmap_dma_buf implementation for GEM
Parameters
struct dma_buf_attachment * attach
- attachment to unmap buffer from
struct sg_table * sgt
- scatterlist info of the buffer to unmap
enum dma_data_direction dir
- direction of DMA transfer
Description
This can be used as the dma_buf_ops.unmap_dma_buf
callback.
-
int
drm_gem_dmabuf_vmap
(struct dma_buf * dma_buf, struct iosys_map * map)¶ dma_buf vmap implementation for GEM
Parameters
struct dma_buf * dma_buf
- buffer to be mapped
struct iosys_map * map
- the virtual address of the buffer
Description
Sets up a kernel virtual mapping. This can be used as the dma_buf_ops.vmap
callback. Calls into drm_gem_object_funcs.vmap
for device specific handling.
The kernel virtual address is returned in map.
Returns 0 on success or a negative errno code otherwise.
-
void
drm_gem_dmabuf_vunmap
(struct dma_buf * dma_buf, struct iosys_map * map)¶ dma_buf vunmap implementation for GEM
Parameters
struct dma_buf * dma_buf
- buffer to be unmapped
struct iosys_map * map
- the virtual address of the buffer
Description
Releases a kernel virtual mapping. This can be used as the
dma_buf_ops.vunmap
callback. Calls into drm_gem_object_funcs.vunmap
for device specific handling.
-
int
drm_gem_prime_mmap
(struct drm_gem_object * obj, struct vm_area_struct * vma)¶ PRIME mmap function for GEM drivers
Parameters
struct drm_gem_object * obj
- GEM object
struct vm_area_struct * vma
- Virtual address range
Description
This function sets up a userspace mapping for PRIME exported buffers using the same codepath that is used for regular GEM buffer mapping on the DRM fd. The fake GEM offset is added to vma->vm_pgoff and drm_driver->fops->mmap is called to set up the mapping.
Drivers can use this as their drm_driver.gem_prime_mmap
callback.
-
int
drm_gem_dmabuf_mmap
(struct dma_buf * dma_buf, struct vm_area_struct * vma)¶ dma_buf mmap implementation for GEM
Parameters
struct dma_buf * dma_buf
- buffer to be mapped
struct vm_area_struct * vma
- virtual address range
Description
Provides memory mapping for the buffer. This can be used as the
dma_buf_ops.mmap
callback. It just forwards to drm_driver.gem_prime_mmap
,
which should be set to drm_gem_prime_mmap()
.
FIXME: There’s really no point to this wrapper, drivers which need anything
else but drm_gem_prime_mmap can roll their own dma_buf_ops.mmap
callback.
Returns 0 on success or a negative error code on failure.
-
struct sg_table *
drm_prime_pages_to_sg
(struct drm_device * dev, struct page ** pages, unsigned int nr_pages)¶ converts a page array into an sg list
Parameters
struct drm_device * dev
- DRM device
struct page ** pages
- pointer to the array of page pointers to convert
unsigned int nr_pages
- length of the page vector
Description
This helper creates an sg table object from a set of pages. The driver is responsible for mapping the pages into the importer’s address space for use with dma_buf itself.
This is useful for implementing drm_gem_object_funcs.get_sg_table
.
-
unsigned long
drm_prime_get_contiguous_size
(struct sg_table * sgt)¶ returns the contiguous size of the buffer
Parameters
struct sg_table * sgt
- sg_table describing the buffer to check
Description
This helper calculates the contiguous size in the DMA address space of the buffer described by the provided sg_table.
This is useful for implementing drm_driver.gem_prime_import_sg_table.
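One way to compute a contiguous size is to walk the DMA segments in order and stop at the first one that does not start where the previous one ended. The userspace model below is a sketch of that idea, not the kernel code: struct dma_seg stands in for a mapped scatterlist entry, and contiguous_size() is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one DMA-mapped scatterlist entry. */
struct dma_seg {
    uint64_t dma_addr;    /* bus address of the segment start */
    unsigned int dma_len; /* segment length in bytes */
};

/* Number of bytes, from the first segment on, that are DMA-contiguous. */
static unsigned long contiguous_size(const struct dma_seg *segs, size_t n)
{
    uint64_t expected;
    unsigned long size = 0;
    size_t i;

    assert(segs != NULL || n == 0);
    if (n == 0)
        return 0;

    expected = segs[0].dma_addr;
    for (i = 0; i < n; i++) {
        /* Stop at an empty segment or at the first gap. */
        if (segs[i].dma_len == 0 || segs[i].dma_addr != expected)
            break;
        expected += segs[i].dma_len;
        size += segs[i].dma_len;
    }
    return size;
}
```

An importer that needs a physically contiguous buffer can compare the result with the object size and reject the import if the contiguous part is too small.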
-
struct dma_buf *
drm_gem_prime_export
(struct drm_gem_object * obj, int flags)¶ helper library implementation of the export callback
Parameters
struct drm_gem_object * obj
- GEM object to export
int flags
- flags like DRM_CLOEXEC and DRM_RDWR
Description
This is the implementation of the drm_gem_object_funcs.export
functions for GEM drivers
using the PRIME helpers. It is used as the default in
drm_gem_prime_handle_to_fd()
.
-
struct drm_gem_object *
drm_gem_prime_import_dev
(struct drm_device * dev, struct dma_buf * dma_buf, struct device * attach_dev)¶ core implementation of the import callback
Parameters
struct drm_device * dev
- drm_device to import into
struct dma_buf * dma_buf
- dma-buf object to import
struct device * attach_dev
- struct device to dma_buf attach
Description
This is the core of drm_gem_prime_import()
. It’s designed to be called by
drivers who want to use a different device structure than drm_device.dev
for
attaching via dma_buf. This function calls
drm_driver.gem_prime_import_sg_table
internally.
Drivers must arrange to call drm_prime_gem_destroy()
from their
drm_gem_object_funcs.free
hook when using this function.
-
struct drm_gem_object *
drm_gem_prime_import
(struct drm_device * dev, struct dma_buf * dma_buf)¶ helper library implementation of the import callback
Parameters
struct drm_device * dev
- drm_device to import into
struct dma_buf * dma_buf
- dma-buf object to import
Description
This is the implementation of the gem_prime_import functions for GEM drivers
using the PRIME helpers. Drivers can use this as their
drm_driver.gem_prime_import
implementation. It is used as the default
implementation in drm_gem_prime_fd_to_handle()
.
Drivers must arrange to call drm_prime_gem_destroy()
from their
drm_gem_object_funcs.free
hook when using this function.
-
int __deprecated
drm_prime_sg_to_page_array
(struct sg_table * sgt, struct page ** pages, int max_entries)¶ convert an sg table into a page array
Parameters
struct sg_table * sgt
- scatter-gather table to convert
struct page ** pages
- array of page pointers to store the pages in
int max_entries
- size of the passed-in array
Description
Exports an sg table into an array of pages.
This function is deprecated and its use is strongly discouraged. The page array is only useful for page faults and those can corrupt fields in the struct page if they are not handled by the exporting driver.
-
int
drm_prime_sg_to_dma_addr_array
(struct sg_table * sgt, dma_addr_t * addrs, int max_entries)¶ convert an sg table into a dma addr array
Parameters
struct sg_table * sgt
- scatter-gather table to convert
dma_addr_t * addrs
- array to store the dma bus address of each page
int max_entries
- size of the passed-in array
Description
Exports an sg table into an array of addresses.
Drivers should use this in their drm_driver.gem_prime_import_sg_table
implementation.
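Conceptually, the conversion expands every DMA segment into one address per page and fails if the output array is too small. The following userspace model is a sketch under stated assumptions: 4 KiB pages, page-aligned segment lengths, and the hypothetical names struct dma_seg and segs_to_addr_array().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u /* assumption: 4 KiB pages */

/* Hypothetical stand-in for one DMA-mapped scatterlist entry. */
struct dma_seg {
    uint64_t dma_addr;    /* bus address of the segment start */
    unsigned int dma_len; /* segment length in bytes, page-aligned here */
};

/* Store one bus address per page. Returns 0, or -1 if addrs is too small. */
static int segs_to_addr_array(const struct dma_seg *segs, size_t nsegs,
                              uint64_t *addrs, int max_entries)
{
    int k = 0;
    size_t i;
    unsigned int off;

    assert(addrs != NULL && max_entries >= 0);
    for (i = 0; i < nsegs; i++) {
        for (off = 0; off < segs[i].dma_len; off += SKETCH_PAGE_SIZE) {
            if (k >= max_entries)
                return -1; /* caller sized the array too small */
            addrs[k++] = segs[i].dma_addr + off;
        }
    }
    return 0;
}
```

The resulting array has one entry per page regardless of how the pages were merged into segments, which is the layout a page-table style GPU MMU setup typically wants.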
-
void
drm_prime_gem_destroy
(struct drm_gem_object * obj, struct sg_table * sg)¶ helper to clean up a PRIME-imported GEM object
Parameters
struct drm_gem_object * obj
- GEM object which was created from a dma-buf
struct sg_table * sg
- the sg-table which was pinned at import time
Description
This is the cleanup function which GEM drivers need to call when they use
drm_gem_prime_import()
or drm_gem_prime_import_dev()
to import dma-bufs.
DRM MM Range Allocator¶
Overview¶
drm_mm provides a simple range allocator. Drivers are free to use the resource allocator from the Linux core if it suits them; the upside of drm_mm is that it’s in the DRM core, which means that it’s easier to extend for some of the crazier special-purpose needs of GPUs.
The main data struct is drm_mm
, allocations are tracked in drm_mm_node
.
Drivers are free to embed either of them into their own suitable
data structures. drm_mm itself will not do any memory allocations of its own,
so if drivers choose not to embed nodes they still need to allocate them
themselves.
The range allocator also supports reservation of preallocated blocks. This is useful for taking over initial mode setting configurations from the firmware, where an object needs to be created which exactly matches the firmware’s scanout target. As long as the range is still free it can be inserted anytime after the allocator is initialized, which helps with avoiding looped dependencies in the driver load sequence.
drm_mm maintains a stack of most recently freed holes, which of all simplistic data structures seems to be a fairly decent approach to clustering allocations and avoiding too much fragmentation. This means free space searches are O(num_holes). Given all the fancy features drm_mm supports, something better would be fairly complex, and since gfx thrashing is a fairly steep cliff this is not a real concern. Removing a node again is O(1).
drm_mm supports a few features: Alignment and range restrictions can be
supplied. Furthermore every drm_mm_node
has a color value (which is just an
opaque unsigned long) which in conjunction with a driver callback can be used
to implement sophisticated placement restrictions. The i915 DRM driver uses
this to implement guard pages between incompatible caching domains in the
graphics TT.
Two behaviors are supported for searching and allocating: bottom-up and top-down. The default is bottom-up. Top-down allocation can be used if the memory area has different restrictions, or just to reduce fragmentation.
Finally iteration helpers to walk all nodes and all holes are provided as are some basic allocator dumpers for debugging.
Note that this range allocator is not thread-safe, drivers need to protect modifications with their own locking. The idea behind this is that for a full memory manager additional data needs to be protected anyway, hence internal locking would be fully redundant.
LRU Scan/Eviction Support¶
Very often GPUs need contiguous allocations for a given object. When evicting objects to make space for a new one, it is therefore not the most efficient approach to simply select objects from the tail of an LRU until there’s a suitable hole: especially for big objects or nodes that otherwise have special allocation constraints, there’s a good chance we evict lots of (smaller) objects unnecessarily.
The DRM range allocator supports this use-case through the scanning
interfaces. First a scan operation needs to be initialized with
drm_mm_scan_init()
or drm_mm_scan_init_with_range()
. The driver adds
objects to the roster, probably by walking an LRU list, but this can be
freely implemented. Eviction candidates are added using
drm_mm_scan_add_block()
until a suitable hole is found or there are no
further evictable objects. Eviction roster metadata is tracked in struct
drm_mm_scan
.
The driver must walk through all objects again in exactly the reverse order to restore the allocator state. Note that while the allocator is used in the scan mode no other operation is allowed.
Finally the driver evicts all objects selected (drm_mm_scan_remove_block()
reported true) in the scan, and any overlapping nodes after color adjustment
(drm_mm_scan_color_evict()
). Adding and removing an object is O(1), and
since freeing a node is also O(1) the overall complexity is
O(scanned_objects). So, like the free stack which needs to be walked before a
scan operation even begins, this is linear in the number of objects. It
doesn’t seem to hurt too badly.
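The scan protocol described above can be sketched with a small user-space model. This is a toy, not the kernel implementation: toy_block, toy_scan and the toy_scan_*() functions are hypothetical names, blocks live in a sorted array covering [0, total), and alignment, color and range restrictions are ignored. It only shows why a block that was scanned does not necessarily have to be evicted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct toy_block {
	uint64_t start, size;
	int scanned;              /* virtually removed while a scan is running */
};

struct toy_scan {
	struct toy_block *blocks; /* sorted by address, non-overlapping */
	size_t count;
	uint64_t total;           /* managed range is [0, total) */
	uint64_t want;            /* size of the hole being looked for */
	uint64_t hit_start, hit_end;
	int found;
};

static void toy_scan_init(struct toy_scan *s, struct toy_block *blocks,
			  size_t count, uint64_t total, uint64_t want)
{
	s->blocks = blocks;
	s->count = count;
	s->total = total;
	s->want = want;
	s->hit_start = s->hit_end = 0;
	s->found = 0;
}

/* Returns 1 once the blocks added so far open up a large enough hole. */
static int toy_scan_add_block(struct toy_scan *s, size_t idx)
{
	size_t lo = idx, hi = idx;
	uint64_t span_start, span_end;

	assert(idx < s->count && !s->blocks[idx].scanned);
	s->blocks[idx].scanned = 1;

	/* Grow the span over neighbouring blocks that are already scanned. */
	while (lo > 0 && s->blocks[lo - 1].scanned)
		lo--;
	while (hi + 1 < s->count && s->blocks[hi + 1].scanned)
		hi++;

	/* The span also covers the free gaps up to the nearest kept blocks. */
	span_start = lo ? s->blocks[lo - 1].start + s->blocks[lo - 1].size : 0;
	span_end = hi + 1 < s->count ? s->blocks[hi + 1].start : s->total;

	if (span_end - span_start < s->want)
		return 0;
	s->hit_start = span_start;            /* bottom-up placement */
	s->hit_end = span_start + s->want;
	s->found = 1;
	return 1;
}

/*
 * Returns 1 if the block overlaps the hole that was found and must be
 * evicted. The toy keeps no list state, so unlike the real allocator it
 * does not depend on the removal order.
 */
static int toy_scan_remove_block(struct toy_scan *s, size_t idx)
{
	struct toy_block *b;

	assert(idx < s->count);
	b = &s->blocks[idx];
	assert(b->scanned);
	b->scanned = 0;
	return s->found && b->start < s->hit_end &&
	       b->start + b->size > s->hit_start;
}
```

With blocks at [0, 20), [25, 45), [45, 70) and [80, 100) in a range of 100 and a wanted size of 40, adding the last, second and third block (in that LRU order) finds the hole [20, 60) on the third call. Walking back in reverse order then reports the third and second block for eviction, while the last block was scanned but can stay.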
DRM MM Range Allocator Function References¶
-
enum
drm_mm_insert_mode
¶ control search and allocation behaviour
Constants
DRM_MM_INSERT_BEST
Search for the smallest hole (within the search range) that fits the desired node.
Allocates the node from the bottom of the found hole.
DRM_MM_INSERT_LOW
Search for the lowest hole (address closest to 0, within the search range) that fits the desired node.
Allocates the node from the bottom of the found hole.
DRM_MM_INSERT_HIGH
Search for the highest hole (address closest to U64_MAX, within the search range) that fits the desired node.
Allocates the node from the top of the found hole. The specified alignment for the node is applied to the base of the node (
drm_mm_node.start
).
DRM_MM_INSERT_EVICT
Search for the most recently evicted hole (within the search range) that fits the desired node. This is appropriate for use immediately after performing an eviction scan (see
drm_mm_scan_init()
) and removing the selected nodes to form a hole.
Allocates the node from the bottom of the found hole.
DRM_MM_INSERT_ONCE
- Only check the first hole for suitability and report -ENOSPC immediately otherwise, rather than check every hole until a suitable one is found. Can only be used in conjunction with another search method such as DRM_MM_INSERT_HIGH or DRM_MM_INSERT_LOW.
DRM_MM_INSERT_HIGHEST
Only check the highest hole (the hole with the largest address) and insert the node at the top of the hole or report -ENOSPC if unsuitable.
Does not search all holes.
DRM_MM_INSERT_LOWEST
Only check the lowest hole (the hole with the smallest address) and insert the node at the bottom of the hole or report -ENOSPC if unsuitable.
Does not search all holes.
Description
The struct drm_mm
range manager supports finding a suitable hole using
a number of search trees. These trees are organised by size, by address and
in most recent eviction order. This allows the user to find either the
smallest hole to reuse, the lowest or highest address to reuse, or simply
reuse the most recent eviction that fits. When allocating the drm_mm_node
from within the hole, the drm_mm_insert_mode
also dictates whether to
allocate the lowest matching address or the highest.
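As a rough illustration of how the best, low and high modes differ, the following user-space C sketch picks a hole from an address-sorted array and places an aligned block in it. It is a toy model under stated assumptions, not the kernel implementation: toy_hole, toy_mode, toy_fit() and toy_insert() are hypothetical names, a linear walk stands in for the search trees, and TOY_NO_SPACE stands in for the -ENOSPC return.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-ins for three of the insert modes; not the kernel's enum. */
enum toy_mode { TOY_BEST, TOY_LOW, TOY_HIGH };

struct toy_hole {
	uint64_t start; /* first free address */
	uint64_t end;   /* one past the last free address */
};

/* Returned when no hole fits (the kernel reports -ENOSPC instead). */
#define TOY_NO_SPACE UINT64_MAX

/*
 * Try to place an aligned block of @size inside @h. Bottom-up placement
 * rounds the hole start up; top-down placement rounds (end - size) down.
 */
static int toy_fit(const struct toy_hole *h, uint64_t size, uint64_t align,
		   int top_down, uint64_t *out)
{
	uint64_t addr;

	assert(align != 0 && size != 0);
	if (h->end - h->start < size)
		return 0;
	if (top_down) {
		addr = h->end - size;
		addr -= addr % align;
		if (addr < h->start)
			return 0;
	} else {
		addr = h->start;
		if (addr % align)
			addr += align - addr % align;
		if (addr > h->end - size) /* need addr + size <= end */
			return 0;
	}
	*out = addr;
	return 1;
}

/* @holes must be sorted by ascending address. */
static uint64_t toy_insert(const struct toy_hole *holes, size_t n,
			   uint64_t size, uint64_t align, enum toy_mode mode)
{
	uint64_t best_addr = TOY_NO_SPACE, best_len = UINT64_MAX, addr;
	size_t i;

	for (i = 0; i < n; i++) {
		/* LOW walks upwards, HIGH walks downwards, BEST looks at all. */
		const struct toy_hole *h =
			&holes[mode == TOY_HIGH ? n - 1 - i : i];
		uint64_t len = h->end - h->start;

		if (!toy_fit(h, size, align, mode == TOY_HIGH, &addr))
			continue;
		if (mode != TOY_BEST)
			return addr;      /* first fit in walk order */
		if (len < best_len) {     /* remember the smallest fitting hole */
			best_len = len;
			best_addr = addr;
		}
	}
	return best_addr;
}
```

With holes at [0x1000, 0x5000), [0x8000, 0x9000) and [0x10000, 0x20100), a request for 0x800 bytes aligned to 0x1000 lands at 0x1000 in low mode, at 0x1f000 in high mode and at 0x8000 in best mode.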
-
struct
drm_mm_node
¶ allocated block in the DRM allocator
Definition
struct drm_mm_node {
unsigned long color;
u64 start;
u64 size;
};
Members
color
- Opaque driver-private tag.
start
- Start address of the allocated block.
size
- Size of the allocated block.
Description
This represents an allocated block in a drm_mm
allocator. Except for
pre-reserved nodes inserted using drm_mm_reserve_node()
the structure is
entirely opaque and should only be accessed through the provided functions.
Since allocation of these nodes is entirely handled by the driver they can be
embedded.
-
struct
drm_mm
¶ DRM allocator
Definition
struct drm_mm {
void (*color_adjust)(const struct drm_mm_node *node, unsigned long color, u64 *start, u64 *end);
};
Members
color_adjust
- Optional driver callback to further apply restrictions on a hole. The
node argument points at the node containing the hole from which the
block would be allocated (see
drm_mm_hole_follows()
and friends). The other arguments are the size of the block to be allocated. The driver can adjust the start and end as needed to e.g. insert guard pages.
Description
DRM range allocator with a few special functions and features geared towards managing GPU memory. Except for the color_adjust callback the structure is entirely opaque and should only be accessed through the provided functions and macros. This structure can be embedded into larger driver structures.
-
struct
drm_mm_scan
¶ DRM allocator eviction roster data
Definition
struct drm_mm_scan {
};
Members
Description
This structure tracks data needed for the eviction roster set up using
drm_mm_scan_init()
, and used with drm_mm_scan_add_block()
and
drm_mm_scan_remove_block()
. The structure is entirely opaque and should only
be accessed through the provided functions and macros. It is meant to be
allocated temporarily by the driver on the stack.
-
bool
drm_mm_node_allocated
(const struct drm_mm_node * node)¶ checks whether a node is allocated
Parameters
const struct drm_mm_node * node
- drm_mm_node to check
Description
Drivers are required to clear a node prior to using it with the drm_mm range manager.
Drivers should use this helper for proper encapsulation of drm_mm internals.
Return
True if the node is allocated.
-
bool
drm_mm_initialized
(const struct drm_mm * mm)¶ checks whether an allocator is initialized
Parameters
const struct drm_mm * mm
- drm_mm to check
Description
Drivers should clear the struct drm_mm prior to initialisation if they want to use this function.
Drivers should use this helper for proper encapsulation of drm_mm internals.
Return
True if the mm is initialized.
-
bool
drm_mm_hole_follows
(const struct drm_mm_node * node)¶ checks whether a hole follows this node
Parameters
const struct drm_mm_node * node
- drm_mm_node to check
Description
Holes are embedded into the drm_mm using the tail of a drm_mm_node.
If you wish to know whether a hole follows this particular node,
query this function. See also drm_mm_hole_node_start()
and
drm_mm_hole_node_end()
.
Return
True if a hole follows the node.
-
u64
drm_mm_hole_node_start
(const struct drm_mm_node * hole_node)¶ computes the start of the hole following node
Parameters
const struct drm_mm_node * hole_node
- drm_mm_node which implicitly tracks the following hole
Description
This is useful for driver-specific debug dumpers. Otherwise drivers should
not inspect holes themselves. Drivers must check first whether a hole indeed
follows by looking at drm_mm_hole_follows()
.
Return
Start of the subsequent hole.
-
u64
drm_mm_hole_node_end
(const struct drm_mm_node * hole_node)¶ computes the end of the hole following node
Parameters
const struct drm_mm_node * hole_node
- drm_mm_node which implicitly tracks the following hole
Description
This is useful for driver-specific debug dumpers. Otherwise drivers should
not inspect holes themselves. Drivers must check first whether a hole indeed
follows by looking at drm_mm_hole_follows()
.
Return
End of the subsequent hole.
-
drm_mm_nodes
(mm)¶ list of nodes under the drm_mm range manager
Parameters
mm
- the struct drm_mm range manager
Description
As the drm_mm range manager hides its node_list deep within its
structure, extracting it looks painful and repetitive. This is
not expected to be used outside of the drm_mm_for_each_node()
macros and similar internal functions.
Return
The node list, may be empty.
-
drm_mm_for_each_node
(entry, mm)¶ iterator to walk over all allocated nodes
Parameters
entry
struct drm_mm_node
to assign to in each iteration step
mm
drm_mm
allocator to walk
Description
This iterator walks over all nodes in the range allocator. It is implemented
with list_for_each()
, so it is not safe against removal of elements.
-
drm_mm_for_each_node_safe
(entry, next, mm)¶ iterator to walk over all allocated nodes
Parameters
entry
struct drm_mm_node
to assign to in each iteration step
next
struct drm_mm_node
to store the next step
mm
drm_mm
allocator to walk
Description
This iterator walks over all nodes in the range allocator. It is implemented
with list_for_each_safe()
, so it is safe against removal of elements.
-
drm_mm_for_each_hole
(pos, mm, hole_start, hole_end)¶ iterator to walk over all holes
Parameters
pos
drm_mm_node
used internally to track progress
mm
drm_mm
allocator to walk
hole_start
- ulong variable to assign the hole start to on each iteration
hole_end
- ulong variable to assign the hole end to on each iteration
Description
This iterator walks over all holes in the range allocator. It is implemented
with list_for_each()
, so it is not safe against removal of elements. pos is used
internally and will not reflect a real drm_mm_node for the very first hole.
Hence users of this iterator may not access it.
Implementation Note: We need to inline list_for_each_entry in order to be able to set hole_start and hole_end on each iteration while keeping the macro sane.
-
int
drm_mm_insert_node_generic
(struct drm_mm * mm, struct drm_mm_node * node, u64 size, u64 alignment, unsigned long color, enum drm_mm_insert_mode mode)¶ search for space and insert node
Parameters
struct drm_mm * mm
- drm_mm to allocate from
struct drm_mm_node * node
- preallocated node to insert
u64 size
- size of the allocation
u64 alignment
- alignment of the allocation
unsigned long color
- opaque tag value to use for this node
enum drm_mm_insert_mode mode
- fine-tune the allocation search and placement
Description
This is a simplified version of drm_mm_insert_node_in_range()
with no
range restrictions applied.
The preallocated node must be cleared to 0.
Return
0 on success, -ENOSPC if there’s no suitable hole.
-
int
drm_mm_insert_node
(struct drm_mm * mm, struct drm_mm_node * node, u64 size)¶ search for space and insert node
Parameters
struct drm_mm * mm
- drm_mm to allocate from
struct drm_mm_node * node
- preallocated node to insert
u64 size
- size of the allocation
Description
This is a simplified version of drm_mm_insert_node_generic()
with color set
to 0.
The preallocated node must be cleared to 0.
Return
0 on success, -ENOSPC if there’s no suitable hole.
-
bool
drm_mm_clean
(const struct drm_mm * mm)¶ checks whether an allocator is clean
Parameters
const struct drm_mm * mm
- drm_mm allocator to check
Return
True if the allocator is completely free, false if there’s still a node allocated in it.
-
drm_mm_for_each_node_in_range
(node__, mm__, start__, end__)¶ iterator to walk over a range of allocated nodes
Parameters
node__
- drm_mm_node structure to assign to in each iteration step
mm__
- drm_mm allocator to walk
start__
- starting offset, the first node will overlap this
end__
- ending offset, the last node will start before this (but may overlap)
Description
This iterator walks over all nodes in the range allocator that lie
between start and end. It is implemented similarly to list_for_each()
,
but using the internal interval tree to accelerate the search for the
starting node, and so not safe against removal of elements. It assumes
that end is within (or is the upper limit of) the drm_mm allocator.
If [start, end] are beyond the range of the drm_mm, the iterator may walk
over the special _unallocated_ drm_mm.head_node
, and may even continue
indefinitely.
-
void
drm_mm_scan_init
(struct drm_mm_scan * scan, struct drm_mm * mm, u64 size, u64 alignment, unsigned long color, enum drm_mm_insert_mode mode)¶ initialize lru scanning
Parameters
struct drm_mm_scan * scan
- scan state
struct drm_mm * mm
- drm_mm to scan
u64 size
- size of the allocation
u64 alignment
- alignment of the allocation
unsigned long color
- opaque tag value to use for the allocation
enum drm_mm_insert_mode mode
- fine-tune the allocation search and placement
Description
This is a simplified version of drm_mm_scan_init_with_range()
with no range
restrictions applied.
This simply sets up the scanning routines with the parameters for the desired hole.
Warning: As long as the scan list is non-empty, no other operations than adding/removing nodes to/from the scan list are allowed.
-
int
drm_mm_reserve_node
(struct drm_mm * mm, struct drm_mm_node * node)¶ insert a pre-initialized node
Parameters
struct drm_mm * mm
- drm_mm allocator to insert node into
struct drm_mm_node * node
- drm_mm_node to insert
Description
This function inserts an already set-up drm_mm_node
into the allocator,
meaning that start, size and color must be set by the caller. All other
fields must be cleared to 0. This is useful to initialize the allocator with
preallocated objects which must be set-up before the range allocator can be
set-up, e.g. when taking over a firmware framebuffer.
Return
0 on success, -ENOSPC if there’s no hole where node is.
-
int
drm_mm_insert_node_in_range
(struct drm_mm *const mm, struct drm_mm_node *const node, u64 size, u64 alignment, unsigned long color, u64 range_start, u64 range_end, enum drm_mm_insert_mode mode)¶ ranged search for space and insert node
Parameters
struct drm_mm *const mm
- drm_mm to allocate from
struct drm_mm_node *const node
- preallocated node to insert
u64 size
- size of the allocation
u64 alignment
- alignment of the allocation
unsigned long color
- opaque tag value to use for this node
u64 range_start
- start of the allowed range for this node
u64 range_end
- end of the allowed range for this node
enum drm_mm_insert_mode mode
- fine-tune the allocation search and placement
Description
The preallocated node must be cleared to 0.
Return
0 on success, -ENOSPC if there’s no suitable hole.
-
void
drm_mm_remove_node
(struct drm_mm_node * node)¶ Remove a memory node from the allocator.
Parameters
struct drm_mm_node * node
- drm_mm_node to remove
Description
This just removes a node from its drm_mm allocator. The node does not need to be cleared again before it can be re-inserted into this or any other drm_mm allocator. It is a bug to call this function on an unallocated node.
-
void
drm_mm_replace_node
(struct drm_mm_node * old, struct drm_mm_node * new)¶ move an allocation from old to new
Parameters
struct drm_mm_node * old
- drm_mm_node to remove from the allocator
struct drm_mm_node * new
- drm_mm_node which should inherit old’s allocation
Description
This is useful for when drivers embed the drm_mm_node structure and hence can’t move allocations by reassigning pointers. It’s a combination of remove and insert with the guarantee that the allocation start will match.
-
void
drm_mm_scan_init_with_range
(struct drm_mm_scan * scan, struct drm_mm * mm, u64 size, u64 alignment, unsigned long color, u64 start, u64 end, enum drm_mm_insert_mode mode)¶ initialize range-restricted lru scanning
Parameters
struct drm_mm_scan * scan
- scan state
struct drm_mm * mm
- drm_mm to scan
u64 size
- size of the allocation
u64 alignment
- alignment of the allocation
unsigned long color
- opaque tag value to use for the allocation
u64 start
- start of the allowed range for the allocation
u64 end
- end of the allowed range for the allocation
enum drm_mm_insert_mode mode
- fine-tune the allocation search and placement
Description
This simply sets up the scanning routines with the parameters for the desired hole.
Warning: As long as the scan list is non-empty, no other operations than adding/removing nodes to/from the scan list are allowed.
-
bool
drm_mm_scan_add_block
(struct drm_mm_scan * scan, struct drm_mm_node * node)¶ add a node to the scan list
Parameters
struct drm_mm_scan * scan
- the active drm_mm scanner
struct drm_mm_node * node
- drm_mm_node to add
Description
Add a node to the scan list that might be freed to make space for the desired hole.
Return
True if a hole has been found, false otherwise.
-
bool
drm_mm_scan_remove_block
(struct drm_mm_scan * scan, struct drm_mm_node * node)¶ remove a node from the scan list
Parameters
struct drm_mm_scan * scan
- the active drm_mm scanner
struct drm_mm_node * node
- drm_mm_node to remove
Description
Nodes must be removed in exactly the reverse order from the scan list as
they have been added (e.g. using list_add()
as they are added and then
list_for_each()
over that eviction list to remove), otherwise the internal
state of the memory manager will be corrupted.
When the scan list is empty, the selected memory nodes can be freed. An
immediately following drm_mm_insert_node_in_range()
or one of the
simpler versions of that function with DRM_MM_INSERT_EVICT will then return
the just freed block (because it is the most recently evicted hole).
Return
True if this block should be evicted, false otherwise. Will always return false when no hole has been found.
-
struct drm_mm_node *
drm_mm_scan_color_evict
(struct drm_mm_scan * scan)¶ evict overlapping nodes on either side of hole
Parameters
struct drm_mm_scan * scan
- drm_mm scan with target hole
Description
After completing an eviction scan and removing the selected nodes, we may need to remove a few more nodes from either side of the target hole if mm.color_adjust is being used.
Return
A node to evict, or NULL if there are no overlapping nodes.
-
void
drm_mm_init
(struct drm_mm * mm, u64 start, u64 size)¶ initialize a drm-mm allocator
Parameters
struct drm_mm * mm
- the drm_mm structure to initialize
u64 start
- start of the range managed by mm
u64 size
- size of the range managed by mm
Description
Note that mm must be cleared to 0 before calling this function.
-
void
drm_mm_takedown
(struct drm_mm * mm)¶ clean up a drm_mm allocator
Parameters
struct drm_mm * mm
- drm_mm allocator to clean up
Description
Note that it is a bug to call this function on an allocator which is not clean.
-
void
drm_mm_print
(const struct drm_mm * mm, struct drm_printer * p)¶ print allocator state
Parameters
const struct drm_mm * mm
- drm_mm allocator to print
struct drm_printer * p
- DRM printer to use
DRM Cache Handling¶
-
void
drm_clflush_pages
(struct page * pages, unsigned long num_pages)¶ Flush dcache lines of a set of pages.
Parameters
struct page * pages
- List of pages to be flushed.
unsigned long num_pages
- Number of pages in the array.
Description
Flush every data cache line entry that points to an address belonging to a page in the array.
-
void
drm_clflush_sg
(struct sg_table * st)¶ Flush dcache lines pointing to a scatter-gather.
Parameters
struct sg_table * st
- struct sg_table.
Description
Flush every data cache line entry that points to an address in the sg.
-
void
drm_clflush_virt_range
(void * addr, unsigned long length)¶ Flush dcache lines of a region
Parameters
void * addr
- Initial kernel memory address.
unsigned long length
- Region size.
Description
Flush every data cache line entry that points to an address in the region requested.
-
void
drm_memcpy_from_wc
(struct iosys_map * dst, const struct iosys_map * src, unsigned long len)¶ Perform the fastest available memcpy from a source that may be WC.
Parameters
struct iosys_map * dst
- The destination pointer
const struct iosys_map * src
- The source pointer
unsigned long len
- The size of the area to transfer in bytes
Description
Tries an arch-optimized memcpy that uses prefetching reads out of a WC region and, if no such implementation is available, falls back to a normal memcpy.
DRM Sync Objects¶
DRM synchronisation objects (syncobj, see struct drm_syncobj
) provide a
container for a synchronization primitive which can be used by userspace
to explicitly synchronize GPU commands, can be shared between userspace
processes, and can be shared between different DRM drivers.
Their primary use-case is to implement Vulkan fences and semaphores.
The syncobj userspace API provides ioctls for several operations:
- Creation and destruction of syncobjs
- Import and export of syncobjs to/from a syncobj file descriptor
- Import and export a syncobj’s underlying fence to/from a sync file
- Reset a syncobj (set its fence to NULL)
- Signal a syncobj (set a trivially signaled fence)
- Wait for a syncobj’s fence to appear and be signaled
The syncobj userspace API also provides operations to manipulate a syncobj
in terms of a timeline of struct dma_fence_chain
rather than a single
struct dma_fence
, through the following operations:
- Signal a given point on the timeline
- Wait for a given point to appear and/or be signaled
- Import and export from/to a given point of a timeline
At its core, a syncobj is simply a wrapper around a pointer to a struct
dma_fence
which may be NULL.
When a syncobj is first created, its pointer is either NULL or a pointer
to an already signaled fence depending on whether the
DRM_SYNCOBJ_CREATE_SIGNALED
flag is passed to
DRM_IOCTL_SYNCOBJ_CREATE
.
If the syncobj is considered as a binary (its state is either signaled or
unsignaled) primitive, when GPU work is enqueued in a DRM driver to signal
the syncobj, the syncobj’s fence is replaced with a fence which will be
signaled by the completion of that work.
If the syncobj is considered as a timeline primitive, when GPU work is
enqueued in a DRM driver to signal a given point of the syncobj, a new
struct dma_fence_chain
is created, pointing to the DRM driver’s fence and also
pointing to the previous fence that was in the syncobj. The new struct
dma_fence_chain
fence replaces the syncobj’s fence and will be signaled by
completion of the DRM driver’s work and also any work associated with the
fence previously in the syncobj.
When GPU work which waits on a syncobj is enqueued in a DRM driver, at the time the work is enqueued, it waits on the syncobj’s fence before submitting the work to hardware. That fence is either:
- The syncobj’s current fence if the syncobj is considered as a binary primitive.
- The struct
dma_fence
associated with a given point if the syncobj is considered as a timeline primitive.
If the syncobj’s fence is NULL or not present in the syncobj’s timeline, the enqueue operation is expected to fail.
With binary syncobj, all manipulation of the syncobj’s fence happens in
terms of the current fence at the time the ioctl is called by userspace
regardless of whether that operation is an immediate host-side operation
(signal or reset) or an operation which is enqueued in some driver
queue. DRM_IOCTL_SYNCOBJ_RESET
and DRM_IOCTL_SYNCOBJ_SIGNAL
can be used
to manipulate a syncobj from the host by resetting its pointer to NULL or
setting its pointer to a fence which is already signaled.
With a timeline syncobj, all manipulation of the syncobj’s fence happens in
terms of a u64 value referring to a point in the timeline. See
dma_fence_chain_find_seqno()
to see how a given point is found in the
timeline.
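To make the idea of looking up a point concrete, here is a minimal user-space sketch of a timeline as a singly linked chain. It is a simplified reading of the lookup, not the kernel code: toy_chain, toy_add_point() and toy_find_point() are hypothetical names, and the sketch ignores signaled nodes, fence contexts, reference counting and the special point value 0. It assumes points are added in increasing order and that a lookup resolves to the oldest node whose sequence number is at least the requested point.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

struct toy_chain {
	uint64_t seqno;         /* timeline point this node stands for */
	struct toy_chain *prev; /* older node, or NULL */
};

/* Add a new head node; the toy insists on strictly increasing points. */
static void toy_add_point(struct toy_chain **head, struct toy_chain *node,
			  uint64_t point)
{
	assert(!*head || (*head)->seqno < point);
	node->seqno = point;
	node->prev = *head;
	*head = node;
}

/*
 * Find the node to wait on for @point: the oldest node whose seqno is
 * still >= @point. Fails with -EINVAL if the point is not on the
 * timeline yet.
 */
static int toy_find_point(struct toy_chain *head, uint64_t point,
			  struct toy_chain **out)
{
	struct toy_chain *n = head;

	if (!head || head->seqno < point)
		return -EINVAL;
	while (n->prev && n->prev->seqno >= point)
		n = n->prev;
	*out = n;
	return 0;
}
```

For a timeline holding points 2, 5 and 9, a lookup of point 3 resolves to the node for point 5, a lookup of point 1 resolves to the node for point 2, and a lookup of point 10 fails because nothing has been submitted for it yet.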
Note that applications should be careful to always use the timeline set of
ioctl()
when dealing with a syncobj considered as a timeline. Using the binary
set of ioctl()
with a syncobj considered as a timeline could result in incorrect
synchronization. The use of binary syncobj is supported through the
timeline set of ioctl()
by using a point value of 0; this will reproduce
the behavior of the binary set of ioctl()
(for example replace the
syncobj’s fence when signaling).
Host-side wait on syncobjs¶
DRM_IOCTL_SYNCOBJ_WAIT
takes an array of syncobj handles and does a
host-side wait on all of the syncobj fences simultaneously.
If DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL
is set, the wait ioctl will wait on
all of the syncobj fences to be signaled before it returns.
Otherwise, it returns once at least one syncobj fence has been signaled
and the index of a signaled fence is written back to the client.
Unlike the enqueued GPU work dependencies which fail if they see a NULL
fence in a syncobj, if DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT
is set,
the host-side wait will first wait for the syncobj to receive a non-NULL
fence and then wait on that fence.
If DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT
is not set and any one of the
syncobjs in the array has a NULL fence, -EINVAL will be returned.
Assuming the syncobj starts off with a NULL fence, this allows a client
to do a host wait in one thread (or process) which waits on GPU work
submitted in another thread (or process) without having to manually
synchronize between the two.
This requirement is inherited from the Vulkan fence API.
Similarly, DRM_IOCTL_SYNCOBJ_TIMELINE_WAIT
takes an array of syncobj
handles as well as an array of u64 points and does a host-side wait on all
of the syncobj fences at the given points simultaneously.
DRM_IOCTL_SYNCOBJ_TIMELINE_WAIT
also adds the ability to wait for a given
fence to materialize on the timeline without waiting for the fence to be
signaled by using the DRM_SYNCOBJ_WAIT_FLAGS_WAIT_AVAILABLE
flag. This
requirement is inherited from the wait-before-signal behavior required by
the Vulkan timeline semaphore API.
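The binary wait rules in this section can be modelled with a single non-blocking poll. The sketch below is a user-space toy, not the ioctl implementation: toy_fence, toy_syncobj and toy_syncobj_poll() are hypothetical names, TOY_WAIT_ALL and TOY_WAIT_FOR_SUBMIT are stand-ins for the real flags, the function checks once instead of sleeping, and -EAGAIN is an arbitrary choice meaning that a real wait would keep waiting.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOY_WAIT_ALL        (1u << 0)
#define TOY_WAIT_FOR_SUBMIT (1u << 1)

struct toy_fence {
	int signaled;
};

struct toy_syncobj {
	struct toy_fence *fence; /* may be NULL */
};

/*
 * One poll over @count syncobjs.
 * Wait-any: returns the index of the first signaled syncobj.
 * Wait-all: returns 0 once every syncobj has a signaled fence.
 * Returns -EINVAL if a fence is NULL and TOY_WAIT_FOR_SUBMIT is not set,
 * and -EAGAIN if the wait condition does not hold yet.
 */
static int toy_syncobj_poll(const struct toy_syncobj *objs, size_t count,
			    unsigned int flags)
{
	size_t i, signaled = 0;
	int first = -1;

	for (i = 0; i < count; i++) {
		const struct toy_fence *f = objs[i].fence;

		if (!f) {
			if (!(flags & TOY_WAIT_FOR_SUBMIT))
				return -EINVAL;
			continue; /* no fence yet: counts as not signaled */
		}
		if (f->signaled) {
			signaled++;
			if (first < 0)
				first = (int)i;
		}
	}
	if (flags & TOY_WAIT_ALL)
		return signaled == count ? 0 : -EAGAIN;
	return first >= 0 ? first : -EAGAIN;
}
```

The point of the model is the asymmetry around NULL fences: without the wait-for-submit flag a single NULL fence makes the whole call fail, while with the flag the NULL fence is simply treated as work that has not been submitted yet.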
Import/export of syncobjs¶
DRM_IOCTL_SYNCOBJ_FD_TO_HANDLE
and DRM_IOCTL_SYNCOBJ_HANDLE_TO_FD
provide two mechanisms for import/export of syncobjs.
The first lets the client import or export an entire syncobj to a file
descriptor.
These fd’s are opaque and have no other use case, except passing the
syncobj between processes.
All exported file descriptors and any syncobj handles created as a
result of importing those file descriptors own a reference to the
same underlying struct drm_syncobj
and the syncobj can be used
persistently across all the processes with which it is shared.
The syncobj is freed only once the last reference is dropped.
Unlike dma-buf, importing a syncobj creates a new handle (with its own
reference) for every import instead of de-duplicating.
The primary use-case of this persistent import/export is for shared
Vulkan fences and semaphores.
The second import/export mechanism, which is indicated by
DRM_SYNCOBJ_FD_TO_HANDLE_FLAGS_IMPORT_SYNC_FILE
or
DRM_SYNCOBJ_HANDLE_TO_FD_FLAGS_EXPORT_SYNC_FILE
lets the client
import/export the syncobj’s current fence from/to a sync_file
.
When a syncobj is exported to a sync file, that sync file wraps the
syncobj’s fence at the time of export and any later signal or reset
operations on the syncobj will not affect the exported sync file.
When a sync file is imported into a syncobj, the syncobj’s fence is set
to the fence wrapped by that sync file.
Because sync files are immutable, resetting or signaling the syncobj
will not affect any sync files whose fences have been imported into the
syncobj.
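The snapshot behaviour of the sync file path can be illustrated with a small reference-counting model. This is a user-space sketch under stated assumptions, not the DRM code: toy_fence, toy_syncobj, toy_sync_file and the toy_*() helpers are hypothetical names, and returning -EINVAL when exporting a syncobj without a fence is a choice made for the toy.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct toy_fence {
	int refs;
};

struct toy_syncobj {
	struct toy_fence *fence; /* current fence, may be NULL */
};

struct toy_sync_file {
	struct toy_fence *fence; /* fixed when the sync file is created */
};

static struct toy_fence *toy_fence_get(struct toy_fence *f)
{
	if (f)
		f->refs++;
	return f;
}

static void toy_fence_put(struct toy_fence *f)
{
	if (f) {
		assert(f->refs > 0);
		f->refs--;
	}
}

/* Replace the syncobj's fence; passing NULL models a reset. */
static void toy_syncobj_replace(struct toy_syncobj *s, struct toy_fence *f)
{
	struct toy_fence *old = s->fence;

	s->fence = toy_fence_get(f);
	toy_fence_put(old);
}

/* Export: the sync file takes its own reference to the current fence. */
static int toy_export_sync_file(const struct toy_syncobj *s,
				struct toy_sync_file *out)
{
	if (!s->fence)
		return -EINVAL;
	out->fence = toy_fence_get(s->fence);
	return 0;
}

/* Import: the syncobj's fence becomes the fence wrapped by the sync file. */
static void toy_import_sync_file(struct toy_syncobj *s,
				 const struct toy_sync_file *in)
{
	toy_syncobj_replace(s, in->fence);
}
```

Because the sync file holds its own reference and never changes which fence it points at, replacing or resetting the syncobj's fence afterwards only changes the syncobj. The same holds in the other direction after an import.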
Import/export of timeline points in timeline syncobjs¶
DRM_IOCTL_SYNCOBJ_TRANSFER
provides a mechanism to transfer a struct
dma_fence_chain
of a syncobj at a given u64 point to another u64 point
into another syncobj.
Note that if you want to transfer a struct dma_fence_chain
from a given
point on a timeline syncobj from/into a binary syncobj, you can use the
point 0 to mean take/replace the fence in the syncobj.
-
struct
drm_syncobj
¶ sync object.
Definition
struct drm_syncobj {
struct kref refcount;
struct dma_fence __rcu *fence;
struct list_head cb_list;
spinlock_t lock;
struct file *file;
};
Members
refcount
- Reference count of this object.
fence
NULL or a pointer to the fence bound to this object.
This field should not be used directly. Use
drm_syncobj_fence_get()
and
drm_syncobj_replace_fence()
instead.
cb_list
- List of callbacks to call when the
fence
gets replaced.
lock
- Protects
cb_list
and write-locks
fence
.
file
- A file backing for this syncobj.
Description
This structure defines a generic sync object which wraps a dma_fence
.
-
void
drm_syncobj_get
(struct drm_syncobj * obj)¶ acquire a syncobj reference
Parameters
struct drm_syncobj * obj
- sync object
Description
This acquires an additional reference to obj. It is illegal to call this without already holding a reference. No locks required.
-
void
drm_syncobj_put
(struct drm_syncobj * obj)¶ release a reference to a sync object.
Parameters
struct drm_syncobj * obj
- sync object.
-
struct dma_fence *
drm_syncobj_fence_get
(struct drm_syncobj * syncobj)¶ get a reference to a fence in a sync object
Parameters
struct drm_syncobj * syncobj
- sync object.
Description
This acquires an additional reference to drm_syncobj.fence
contained in syncobj,
if not NULL. It is illegal to call this without already holding a reference.
No locks required.
Return
Either the fence of syncobj or NULL if there’s none.
-
struct drm_syncobj *
drm_syncobj_find
(struct drm_file * file_private, u32 handle)¶ lookup and reference a sync object.
Parameters
struct drm_file * file_private
- drm file private pointer
u32 handle
- sync object handle to lookup.
Description
Returns a reference to the syncobj pointed to by handle or NULL. The
reference must be released by calling drm_syncobj_put()
.
-
void
drm_syncobj_add_point
(struct drm_syncobj * syncobj, struct dma_fence_chain * chain, struct dma_fence * fence, uint64_t point)¶ add new timeline point to the syncobj
Parameters
struct drm_syncobj * syncobj
- sync object to add timeline point to
struct dma_fence_chain * chain
- chain node to use to add the point
struct dma_fence * fence
- fence to encapsulate in the chain node
uint64_t point
- sequence number to use for the point
Description
Add the chain node as new timeline point to the syncobj.
-
void
drm_syncobj_replace_fence
(struct drm_syncobj * syncobj, struct dma_fence * fence)¶ replace fence in a sync object.
Parameters
struct drm_syncobj * syncobj
- Sync object to replace fence in
struct dma_fence * fence
- fence to install in the sync object.
Description
This replaces the fence on a sync object.
-
int
drm_syncobj_find_fence
(struct drm_file * file_private, u32 handle, u64 point, u64 flags, struct dma_fence ** fence)¶ lookup and reference the fence in a sync object
Parameters
struct drm_file * file_private
- drm file private pointer
u32 handle
- sync object handle to lookup.
u64 point
- timeline point
u64 flags
- wait flags: DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT or 0
struct dma_fence ** fence
- out parameter for the fence
Description
This is just a convenience function that combines drm_syncobj_find() and
drm_syncobj_fence_get().
Returns 0 on success or a negative error value on failure. On success fence
contains a reference to the fence, which must be released by calling
dma_fence_put().
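The combination of the two steps, and the resulting ownership rules, can be sketched in a userspace model. Everything here is hypothetical: `sketch_fence`, `sketch_syncobj` and `sketch_find_fence()` are invented names, the handle table is a plain array, the `point` and `flags` arguments are ignored, and the specific error codes are an assumption of the sketch rather than a statement about the kernel function.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace model only: none of these names are kernel APIs. */
struct sketch_fence {
	int refcount;
};

struct sketch_syncobj {
	int refcount;
	struct sketch_fence *fence;	/* may be NULL: no fence installed */
};

/*
 * Combine "look up the syncobj" with "take a fence reference".
 * The syncobj reference is only needed while the fence is fetched,
 * so it is dropped again before returning. On success the caller
 * owns exactly one reference on *fence.
 */
int sketch_find_fence(struct sketch_syncobj **table, size_t table_len,
		      uint32_t handle, struct sketch_fence **fence)
{
	struct sketch_syncobj *obj;

	*fence = NULL;
	if (handle == 0 || handle > table_len || !table[handle - 1])
		return -ENOENT;		/* assumed code: unknown handle */

	obj = table[handle - 1];
	obj->refcount++;		/* step 1: find and reference */

	if (obj->fence) {
		obj->fence->refcount++;	/* step 2: reference the fence */
		*fence = obj->fence;
	}

	assert(obj->refcount > 0);
	obj->refcount--;		/* step 3: drop the syncobj reference */

	return *fence ? 0 : -EINVAL;	/* assumed code: no fence present */
}
```

The point of the model is the bookkeeping: after a successful call the syncobj's count is unchanged and the fence's count has grown by one, so the caller only has the fence to release.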
-
void
drm_syncobj_free
(struct kref * kref)¶ free a sync object.
Parameters
struct kref * kref
- kref to free.
Description
Only to be called from kref_put() in drm_syncobj_put().
-
int
drm_syncobj_create
(struct drm_syncobj ** out_syncobj, uint32_t flags, struct dma_fence * fence)¶ create a new syncobj
Parameters
struct drm_syncobj ** out_syncobj
- returned syncobj
uint32_t flags
- DRM_SYNCOBJ_* flags
struct dma_fence * fence
- if non-NULL, the syncobj will represent this fence
Description
This is the first function to create a sync object. After creation, drivers
will usually want to make it available to userspace, either through
drm_syncobj_get_handle() or drm_syncobj_get_fd().
Returns 0 on success or a negative error value on failure.
-
int
drm_syncobj_get_handle
(struct drm_file * file_private, struct drm_syncobj * syncobj, u32 * handle)¶ get a handle from a syncobj
Parameters
struct drm_file * file_private
- drm file private pointer
struct drm_syncobj * syncobj
- Sync object to export
u32 * handle
- out parameter with the new handle
Description
Exports a sync object created with drm_syncobj_create() as a handle on
file_private to userspace.
Returns 0 on success or a negative error value on failure.
-
int
drm_syncobj_get_fd
(struct drm_syncobj * syncobj, int * p_fd)¶ get a file descriptor from a syncobj
Parameters
struct drm_syncobj * syncobj
- Sync object to export
int * p_fd
- out parameter with the new file descriptor
Description
Exports a sync object created with drm_syncobj_create() as a file descriptor.
Returns 0 on success or a negative error value on failure.
-
signed long
drm_timeout_abs_to_jiffies
(int64_t timeout_nsec)¶ calculate jiffies timeout from absolute value
Parameters
int64_t timeout_nsec
- absolute timeout in ns, 0 for poll
Description
Calculate the relative timeout in jiffies from an absolute time in nanoseconds.
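The arithmetic behind such a conversion can be sketched in userspace. This is a model, not the kernel implementation: `model_abs_timeout_to_ticks()` is a hypothetical name, the current time and the tick rate are passed in explicitly as `now_nsec` and `hz`, and both the round-up by one tick and the clamp below `LONG_MAX` are assumptions of the sketch.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/*
 * Userspace model only. now_nsec and hz are explicit parameters here;
 * in a kernel they would come from the clock and the tick rate.
 */
long model_abs_timeout_to_ticks(int64_t timeout_nsec, int64_t now_nsec,
				long hz)
{
	const int64_t nsec_per_sec = 1000000000LL;
	int64_t remaining, ticks;

	assert(hz > 0);
	assert(now_nsec >= 0);

	/* 0 means "poll"; a deadline that is not in the future has expired */
	if (timeout_nsec == 0 || timeout_nsec <= now_nsec)
		return 0;

	/* absolute deadline -> relative duration */
	remaining = timeout_nsec - now_nsec;

	/* split into whole seconds and remainder to avoid overflow */
	ticks = (remaining / nsec_per_sec) * hz +
		(remaining % nsec_per_sec) * hz / nsec_per_sec;

	/* keep the result below LONG_MAX so it never means "wait forever" */
	if (ticks >= LONG_MAX - 1)
		return LONG_MAX - 1;

	/* round up by one tick so the wait is never shorter than requested */
	return (long)ticks + 1;
}
```

With `hz` set to 1000, a deadline two seconds after `now_nsec` yields 2001 ticks in this model, and a deadline of 0 or one that has already passed yields 0.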