From 95844d20ae024b5d553c9923a0d3145c3956bf69 Mon Sep 17 00:00:00 2001 From: Marek Olšák Date: Wed, 17 Aug 2016 23:49:27 +0200 Subject: drm/amdgpu: throttle buffer migrations at CS using a fixed MBps limit (v2) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The old mechanism used a per-submission limit that didn't take previous submissions within the same time frame into account. It also filled VRAM slowly when VRAM usage dropped due to a big eviction or buffer deallocation. This new method establishes a configurable MBps limit that is obeyed when VRAM usage is very high. When VRAM usage is not very high, it gives the driver the freedom to fill it quickly. The result is more consistent performance. It can't keep the BO move rate low if lots of evictions are happening due to VRAM fragmentation, or if a big buffer is being migrated. The amdgpu.moverate parameter can be used to set a non-default limit. Measurements can be done to find out which amdgpu.moverate setting gives the best results. Mainly APUs and cards with small VRAM will benefit from this. For F1 2015, anything with 2 GB VRAM or less will benefit. Some benchmark results - F1 2015 (Tonga 2GB): Limit MinFPS AvgFPS Old code: 14 32.6 128 MB/s: 28 41 64 MB/s: 15.5 43 32 MB/s: 28.7 43.4 8 MB/s: 27.8 44.4 8 MB/s: 21.9 42.8 (different run) Random drops in Min FPS can still occur (due to fragmented VRAM?), but the average FPS is much better. 8 MB/s is probably a good limit for this game & the current VRAM management. The random FPS drops are still to be tackled. v2: use a spinlock Signed-off-by: Marek Olšák Acked-by: Christian König Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 10 ++++++++++ 1 file changed, 10 insertions(+) (limited to 'drivers/gpu/drm/amd/amdgpu/amdgpu_device.c') diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index 1ef4034b3be5..847583d8a3b3 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -1490,6 +1490,7 @@ int amdgpu_device_init(struct amdgpu_device *adev, { int r, i; bool runtime = false; + u32 max_MBps; adev->shutdown = false; adev->dev = &pdev->dev; @@ -1549,6 +1550,7 @@ int amdgpu_device_init(struct amdgpu_device *adev, spin_lock_init(&adev->didt_idx_lock); spin_lock_init(&adev->gc_cac_idx_lock); spin_lock_init(&adev->audio_endpt_idx_lock); + spin_lock_init(&adev->mm_stats.lock); INIT_LIST_HEAD(&adev->shadow_list); mutex_init(&adev->shadow_list_lock); @@ -1660,6 +1662,14 @@ int amdgpu_device_init(struct amdgpu_device *adev, adev->accel_working = true; + /* Initialize the buffer migration limit. */ + if (amdgpu_moverate >= 0) + max_MBps = amdgpu_moverate; + else + max_MBps = 8; /* Allow 8 MB/s. */ + /* Get a log2 for easy divisions. */ + adev->mm_stats.log2_max_MBps = ilog2(max(1u, max_MBps)); + amdgpu_fbdev_init(adev); r = amdgpu_ib_pool_init(adev); -- cgit v1.2.3