Discussion:
[PATCH v2 4/4] drm/vc4: Restrict active CTM to one CRTC
Daniel Stone
2018-03-25 08:01:18 UTC
Permalink
Hi Stefan,
+static int vc4_crtc_get_ctm_fifo(struct vc4_dev *vc4)
+{
+ return VC4_GET_FIELD(HVS_READ(SCALER_OLEDOFFS),
+ SCALER_OLEDOFFS_DISPFIFO);
+}
This needs to be managed as a global resource through atomic state
objects, rather than checking the current hardware state.

Cheers,
Daniel
Stefan Schake
2018-03-25 01:52:37 UTC
Permalink
From: Eric Anholt <***@anholt.net>

At least the RGBA expand field we should have been setting, because we
aren't expanding correctly for 565 -> 8888. Other registers are ones
that may be interesting for various projects that have been discussed.

Signed-off-by: Eric Anholt <***@anholt.net>
Cc: Stefan Schake <***@gmail.com>
---
v2: New in the series. Included here to keep kbuild robot happy and give
the full picture.

drivers/gpu/drm/vc4/vc4_regs.h | 96 ++++++++++++++++++++++++++++++++++++++++++
1 file changed, 96 insertions(+)

diff --git a/drivers/gpu/drm/vc4/vc4_regs.h b/drivers/gpu/drm/vc4/vc4_regs.h
index a141496104a6..4af3e29d076a 100644
--- a/drivers/gpu/drm/vc4/vc4_regs.h
+++ b/drivers/gpu/drm/vc4/vc4_regs.h
@@ -330,6 +330,21 @@
#define SCALER_DISPCTRL0 0x00000040
# define SCALER_DISPCTRLX_ENABLE BIT(31)
# define SCALER_DISPCTRLX_RESET BIT(30)
+/* Generates a single frame when VSTART is seen and stops at the last
+ * pixel read from the FIFO.
+ */
+# define SCALER_DISPCTRLX_ONESHOT BIT(29)
+/* Processes a single context in the dlist and then task switch,
+ * instead of an entire line.
+ */
+# define SCALER_DISPCTRLX_ONECTX BIT(28)
+/* Set to have DISPSLAVE return 2 16bpp pixels and no status data. */
+# define SCALER_DISPCTRLX_FIFO32 BIT(27)
+/* Turns on output to the DISPSLAVE register instead of the normal
+ * FIFO.
+ */
+# define SCALER_DISPCTRLX_FIFOREG BIT(26)
+
# define SCALER_DISPCTRLX_WIDTH_MASK VC4_MASK(23, 12)
# define SCALER_DISPCTRLX_WIDTH_SHIFT 12
# define SCALER_DISPCTRLX_HEIGHT_MASK VC4_MASK(11, 0)
@@ -402,6 +417,68 @@
*/
# define SCALER_GAMADDR_SRAMENB BIT(30)

+#define SCALER_OLEDOFFS 0x00000080
+/* Clamps R to [16,235] and G/B to [16,240]. */
+# define SCALER_OLEDOFFS_YUVCLAMP BIT(31)
+
+/* Chooses which display FIFO the matrix applies to. */
+# define SCALER_OLEDOFFS_DISPFIFO_MASK VC4_MASK(25, 24)
+# define SCALER_OLEDOFFS_DISPFIFO_SHIFT 24
+# define SCALER_OLEDOFFS_DISPFIFO_DISABLED 0
+# define SCALER_OLEDOFFS_DISPFIFO_0 1
+# define SCALER_OLEDOFFS_DISPFIFO_1 2
+# define SCALER_OLEDOFFS_DISPFIFO_2 3
+
+/* Offsets are 8-bit 2s-complement. */
+# define SCALER_OLEDOFFS_RED_MASK VC4_MASK(23, 16)
+# define SCALER_OLEDOFFS_RED_SHIFT 16
+# define SCALER_OLEDOFFS_GREEN_MASK VC4_MASK(15, 8)
+# define SCALER_OLEDOFFS_GREEN_SHIFT 8
+# define SCALER_OLEDOFFS_BLUE_MASK VC4_MASK(7, 0)
+# define SCALER_OLEDOFFS_BLUE_SHIFT 0
+
+/* The coefficients are S0.9 fractions. */
+#define SCALER_OLEDCOEF0 0x00000084
+# define SCALER_OLEDCOEF0_B_TO_R_MASK VC4_MASK(29, 20)
+# define SCALER_OLEDCOEF0_B_TO_R_SHIFT 20
+# define SCALER_OLEDCOEF0_B_TO_G_MASK VC4_MASK(19, 10)
+# define SCALER_OLEDCOEF0_B_TO_G_SHIFT 10
+# define SCALER_OLEDCOEF0_B_TO_B_MASK VC4_MASK(9, 0)
+# define SCALER_OLEDCOEF0_B_TO_B_SHIFT 0
+
+#define SCALER_OLEDCOEF1 0x00000088
+# define SCALER_OLEDCOEF1_G_TO_R_MASK VC4_MASK(29, 20)
+# define SCALER_OLEDCOEF1_G_TO_R_SHIFT 20
+# define SCALER_OLEDCOEF1_G_TO_G_MASK VC4_MASK(19, 10)
+# define SCALER_OLEDCOEF1_G_TO_G_SHIFT 10
+# define SCALER_OLEDCOEF1_G_TO_B_MASK VC4_MASK(9, 0)
+# define SCALER_OLEDCOEF1_G_TO_B_SHIFT 0
+
+#define SCALER_OLEDCOEF2 0x0000008c
+# define SCALER_OLEDCOEF2_R_TO_R_MASK VC4_MASK(29, 20)
+# define SCALER_OLEDCOEF2_R_TO_R_SHIFT 20
+# define SCALER_OLEDCOEF2_R_TO_G_MASK VC4_MASK(19, 10)
+# define SCALER_OLEDCOEF2_R_TO_G_SHIFT 10
+# define SCALER_OLEDCOEF2_R_TO_B_MASK VC4_MASK(9, 0)
+# define SCALER_OLEDCOEF2_R_TO_B_SHIFT 0
+
+/* Slave addresses for DMAing from HVS composition output to other
+ * devices. The top bits are valid only in !FIFO32 mode.
+ */
+#define SCALER_DISPSLAVE0 0x000000c0
+#define SCALER_DISPSLAVE1 0x000000c9
+#define SCALER_DISPSLAVE2 0x000000d0
+# define SCALER_DISPSLAVE_ISSUE_VSTART BIT(31)
+# define SCALER_DISPSLAVE_ISSUE_HSTART BIT(30)
+/* Set when the current line has been read and an HSTART is required. */
+# define SCALER_DISPSLAVE_EOL BIT(26)
+/* Set when the display FIFO is empty. */
+# define SCALER_DISPSLAVE_EMPTY BIT(25)
+/* Set when there is RGB data ready to read. */
+# define SCALER_DISPSLAVE_VALID BIT(24)
+# define SCALER_DISPSLAVE_RGB_MASK VC4_MASK(23, 0)
+# define SCALER_DISPSLAVE_RGB_SHIFT 0
+
#define SCALER_GAMDATA 0x000000e0
#define SCALER_DLIST_START 0x00002000
#define SCALER_DLIST_SIZE 0x00004000
@@ -767,6 +844,10 @@ enum hvs_pixel_format {
HVS_PIXEL_FORMAT_YCBCR_YUV420_2PLANE = 9,
HVS_PIXEL_FORMAT_YCBCR_YUV422_3PLANE = 10,
HVS_PIXEL_FORMAT_YCBCR_YUV422_2PLANE = 11,
+ HVS_PIXEL_FORMAT_H264 = 12,
+ HVS_PIXEL_FORMAT_PALETTE = 13,
+ HVS_PIXEL_FORMAT_YUV444_RGB = 14,
+ HVS_PIXEL_FORMAT_AYUV444_RGB = 15,
};

/* Note: the LSB is the rightmost character shown. Only valid for
@@ -800,12 +881,27 @@ enum hvs_pixel_format {
#define SCALER_CTL0_TILING_128B 2
#define SCALER_CTL0_TILING_256B_OR_T 3

+#define SCALER_CTL0_ALPHA_MASK BIT(19)
#define SCALER_CTL0_HFLIP BIT(16)
#define SCALER_CTL0_VFLIP BIT(15)

+#define SCALER_CTL0_KEY_MODE_MASK VC4_MASK(18, 17)
+#define SCALER_CTL0_KEY_MODE_SHIFT 17
+#define SCALER_CTL0_KEY_DISABLED 0
+#define SCALER_CTL0_KEY_LUMA_OR_COMMON_RGB 1
+#define SCALER_CTL0_KEY_MATCH 2 /* turn transparent */
+#define SCALER_CTL0_KEY_REPLACE 3 /* replace with value from key mask word 2 */
+
#define SCALER_CTL0_ORDER_MASK VC4_MASK(14, 13)
#define SCALER_CTL0_ORDER_SHIFT 13

+#define SCALER_CTL0_RGBA_EXPAND_MASK VC4_MASK(12, 11)
+#define SCALER_CTL0_RGBA_EXPAND_SHIFT 11
+#define SCALER_CTL0_RGBA_EXPAND_ZERO 0
+#define SCALER_CTL0_RGBA_EXPAND_LSB 1
+#define SCALER_CTL0_RGBA_EXPAND_MSB 2
+#define SCALER_CTL0_RGBA_EXPAND_ROUND 3
+
#define SCALER_CTL0_SCL1_MASK VC4_MASK(10, 8)
#define SCALER_CTL0_SCL1_SHIFT 8
--
2.14.1
Stefan Schake
2018-03-25 01:52:38 UTC
Permalink
We are an atomic driver so the gamma LUT should also be exposed as a
CRTC property through the DRM atomic color management. This will also
take care of the legacy path for us.

Signed-off-by: Stefan Schake <***@gmail.com>
---
v2: Use drm_color_lut_size for LUT length

drivers/gpu/drm/vc4/vc4_crtc.c | 24 +++++++++++++-----------
1 file changed, 13 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/vc4/vc4_crtc.c b/drivers/gpu/drm/vc4/vc4_crtc.c
index bf4667481935..239215cb3274 100644
--- a/drivers/gpu/drm/vc4/vc4_crtc.c
+++ b/drivers/gpu/drm/vc4/vc4_crtc.c
@@ -298,23 +298,21 @@ vc4_crtc_lut_load(struct drm_crtc *crtc)
HVS_WRITE(SCALER_GAMDATA, vc4_crtc->lut_b[i]);
}

-static int
-vc4_crtc_gamma_set(struct drm_crtc *crtc, u16 *r, u16 *g, u16 *b,
- uint32_t size,
- struct drm_modeset_acquire_ctx *ctx)
+static void
+vc4_crtc_update_gamma_lut(struct drm_crtc *crtc)
{
struct vc4_crtc *vc4_crtc = to_vc4_crtc(crtc);
+ struct drm_color_lut *lut = crtc->state->gamma_lut->data;
+ u32 length = drm_color_lut_size(crtc->state->gamma_lut);
u32 i;

- for (i = 0; i < size; i++) {
- vc4_crtc->lut_r[i] = r[i] >> 8;
- vc4_crtc->lut_g[i] = g[i] >> 8;
- vc4_crtc->lut_b[i] = b[i] >> 8;
+ for (i = 0; i < length; i++) {
+ vc4_crtc->lut_r[i] = drm_color_lut_extract(lut[i].red, 8);
+ vc4_crtc->lut_g[i] = drm_color_lut_extract(lut[i].green, 8);
+ vc4_crtc->lut_b[i] = drm_color_lut_extract(lut[i].blue, 8);
}

vc4_crtc_lut_load(crtc);
-
- return 0;
}

static u32 vc4_get_fifo_full_level(u32 format)
@@ -699,6 +697,9 @@ static void vc4_crtc_atomic_flush(struct drm_crtc *crtc,
if (crtc->state->active && old_state->active)
vc4_crtc_update_dlist(crtc);

+ if (crtc->state->color_mgmt_changed && crtc->state->gamma_lut)
+ vc4_crtc_update_gamma_lut(crtc);
+
if (debug_dump_regs) {
DRM_INFO("CRTC %d HVS after:\n", drm_crtc_index(crtc));
vc4_hvs_dump_state(dev);
@@ -909,7 +910,7 @@ static const struct drm_crtc_funcs vc4_crtc_funcs = {
.reset = vc4_crtc_reset,
.atomic_duplicate_state = vc4_crtc_duplicate_state,
.atomic_destroy_state = vc4_crtc_destroy_state,
- .gamma_set = vc4_crtc_gamma_set,
+ .gamma_set = drm_atomic_helper_legacy_gamma_set,
.enable_vblank = vc4_enable_vblank,
.disable_vblank = vc4_disable_vblank,
};
@@ -1035,6 +1036,7 @@ static int vc4_crtc_bind(struct device *dev, struct device *master, void *data)
primary_plane->crtc = crtc;
vc4_crtc->channel = vc4_crtc->data->hvs_channel;
drm_mode_crtc_set_gamma_size(crtc, ARRAY_SIZE(vc4_crtc->lut_r));
+ drm_crtc_enable_color_mgmt(crtc, 0, false, crtc->gamma_size);

/* Set up some arbitrary number of planes. We're not limited
* by a set number of physical registers, just the space in
--
2.14.1
Eric Anholt
2018-03-30 16:05:32 UTC
Permalink
Post by Stefan Schake
We are an atomic driver so the gamma LUT should also be exposed as a
CRTC property through the DRM atomic color management. This will also
take care of the legacy path for us.
---
v2: Use drm_color_lut_size for LUT length
drivers/gpu/drm/vc4/vc4_crtc.c | 24 +++++++++++++-----------
1 file changed, 13 insertions(+), 11 deletions(-)
diff --git a/drivers/gpu/drm/vc4/vc4_crtc.c b/drivers/gpu/drm/vc4/vc4_crtc.c
index bf4667481935..239215cb3274 100644
--- a/drivers/gpu/drm/vc4/vc4_crtc.c
+++ b/drivers/gpu/drm/vc4/vc4_crtc.c
@@ -298,23 +298,21 @@ vc4_crtc_lut_load(struct drm_crtc *crtc)
HVS_WRITE(SCALER_GAMDATA, vc4_crtc->lut_b[i]);
}
-static int
-vc4_crtc_gamma_set(struct drm_crtc *crtc, u16 *r, u16 *g, u16 *b,
- uint32_t size,
- struct drm_modeset_acquire_ctx *ctx)
+static void
+vc4_crtc_update_gamma_lut(struct drm_crtc *crtc)
{
struct vc4_crtc *vc4_crtc = to_vc4_crtc(crtc);
+ struct drm_color_lut *lut = crtc->state->gamma_lut->data;
+ u32 length = drm_color_lut_size(crtc->state->gamma_lut);
u32 i;
- for (i = 0; i < size; i++) {
- vc4_crtc->lut_r[i] = r[i] >> 8;
- vc4_crtc->lut_g[i] = g[i] >> 8;
- vc4_crtc->lut_b[i] = b[i] >> 8;
+ for (i = 0; i < length; i++) {
+ vc4_crtc->lut_r[i] = drm_color_lut_extract(lut[i].red, 8);
+ vc4_crtc->lut_g[i] = drm_color_lut_extract(lut[i].green, 8);
+ vc4_crtc->lut_b[i] = drm_color_lut_extract(lut[i].blue, 8);
}
vc4_crtc_lut_load(crtc);
-
- return 0;
}
static u32 vc4_get_fifo_full_level(u32 format)
@@ -699,6 +697,9 @@ static void vc4_crtc_atomic_flush(struct drm_crtc *crtc,
if (crtc->state->active && old_state->active)
vc4_crtc_update_dlist(crtc);
+ if (crtc->state->color_mgmt_changed && crtc->state->gamma_lut)
+ vc4_crtc_update_gamma_lut(crtc);
Don't we need to set things back to linear if gamma_lut is NULL? (maybe
by updating the SCALER_DISPBKGND_GAMMA flag on the HVS channel)

Other than that, this looks great.
Post by Stefan Schake
+
if (debug_dump_regs) {
DRM_INFO("CRTC %d HVS after:\n", drm_crtc_index(crtc));
vc4_hvs_dump_state(dev);
@@ -909,7 +910,7 @@ static const struct drm_crtc_funcs vc4_crtc_funcs = {
.reset = vc4_crtc_reset,
.atomic_duplicate_state = vc4_crtc_duplicate_state,
.atomic_destroy_state = vc4_crtc_destroy_state,
- .gamma_set = vc4_crtc_gamma_set,
+ .gamma_set = drm_atomic_helper_legacy_gamma_set,
.enable_vblank = vc4_enable_vblank,
.disable_vblank = vc4_disable_vblank,
};
@@ -1035,6 +1036,7 @@ static int vc4_crtc_bind(struct device *dev, struct device *master, void *data)
primary_plane->crtc = crtc;
vc4_crtc->channel = vc4_crtc->data->hvs_channel;
drm_mode_crtc_set_gamma_size(crtc, ARRAY_SIZE(vc4_crtc->lut_r));
+ drm_crtc_enable_color_mgmt(crtc, 0, false, crtc->gamma_size);
/* Set up some arbitrary number of planes. We're not limited
* by a set number of physical registers, just the space in
--
2.14.1
Stefan Schake
2018-03-25 01:52:39 UTC
Permalink
The hardware supports a CTM with S0.9 values. We therefore only allow
a value of 1.0 or fractional only and reject all others with integer
parts. This restriction is mostly inconsequential in practice since
commonly used transformation matrices have all scalars <= 1.0.

Signed-off-by: Stefan Schake <***@gmail.com>
---
v2: Simplify CTM atomic check (Ville)

drivers/gpu/drm/vc4/vc4_crtc.c | 97 ++++++++++++++++++++++++++++++++++++++++--
1 file changed, 94 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/vc4/vc4_crtc.c b/drivers/gpu/drm/vc4/vc4_crtc.c
index 8d71098d00c4..bafb0102fe1d 100644
--- a/drivers/gpu/drm/vc4/vc4_crtc.c
+++ b/drivers/gpu/drm/vc4/vc4_crtc.c
@@ -315,6 +315,79 @@ vc4_crtc_update_gamma_lut(struct drm_crtc *crtc)
vc4_crtc_lut_load(crtc);
}

+/* Converts a DRM S31.32 value to the HW S0.9 format. */
+static u16 vc4_crtc_s31_32_to_s0_9(u64 in)
+{
+ u16 r;
+
+ /* Sign bit. */
+ r = in & BIT_ULL(63) ? BIT(9) : 0;
+ /* We have zero integer bits so we can only saturate here. */
+ if ((in & GENMASK_ULL(62, 32)) > 0)
+ r |= GENMASK(8, 0);
+ /* Otherwise take the 9 most important fractional bits. */
+ else
+ r |= (in >> 22) & GENMASK(8, 0);
+ return r;
+}
+
+static void
+vc4_crtc_update_ctm(struct drm_crtc *crtc)
+{
+ struct drm_device *dev = crtc->dev;
+ struct vc4_dev *vc4 = to_vc4_dev(dev);
+ struct vc4_crtc *vc4_crtc = to_vc4_crtc(crtc);
+ struct drm_color_ctm *ctm = crtc->state->ctm->data;
+
+ HVS_WRITE(SCALER_OLEDCOEF2,
+ VC4_SET_FIELD(vc4_crtc_s31_32_to_s0_9(ctm->matrix[0]),
+ SCALER_OLEDCOEF2_R_TO_R) |
+ VC4_SET_FIELD(vc4_crtc_s31_32_to_s0_9(ctm->matrix[3]),
+ SCALER_OLEDCOEF2_R_TO_G) |
+ VC4_SET_FIELD(vc4_crtc_s31_32_to_s0_9(ctm->matrix[6]),
+ SCALER_OLEDCOEF2_R_TO_B));
+ HVS_WRITE(SCALER_OLEDCOEF1,
+ VC4_SET_FIELD(vc4_crtc_s31_32_to_s0_9(ctm->matrix[1]),
+ SCALER_OLEDCOEF1_G_TO_R) |
+ VC4_SET_FIELD(vc4_crtc_s31_32_to_s0_9(ctm->matrix[4]),
+ SCALER_OLEDCOEF1_G_TO_G) |
+ VC4_SET_FIELD(vc4_crtc_s31_32_to_s0_9(ctm->matrix[7]),
+ SCALER_OLEDCOEF1_G_TO_B));
+ HVS_WRITE(SCALER_OLEDCOEF0,
+ VC4_SET_FIELD(vc4_crtc_s31_32_to_s0_9(ctm->matrix[2]),
+ SCALER_OLEDCOEF0_B_TO_R) |
+ VC4_SET_FIELD(vc4_crtc_s31_32_to_s0_9(ctm->matrix[5]),
+ SCALER_OLEDCOEF0_B_TO_G) |
+ VC4_SET_FIELD(vc4_crtc_s31_32_to_s0_9(ctm->matrix[8]),
+ SCALER_OLEDCOEF0_B_TO_B));
+
+ /* Channel is 0-based but for DISPFIFO, 0 means disabled. */
+ HVS_WRITE(SCALER_OLEDOFFS, VC4_SET_FIELD(vc4_crtc->channel + 1,
+ SCALER_OLEDOFFS_DISPFIFO));
+}
+
+/* Check if the CTM contains valid input.
+ *
+ * DRM exposes CTM with S31.32 scalars, but the HW only supports S0.9.
+ * We don't allow integer values >1, and 1 only without fractional part
+ * to handle the common 1.0 value.
+ */
+static int vc4_crtc_atomic_check_ctm(struct drm_crtc_state *state)
+{
+ struct drm_color_ctm *ctm = state->ctm->data;
+ u32 i;
+
+ for (i = 0; i < ARRAY_SIZE(ctm->matrix); i++) {
+ u64 val = ctm->matrix[i];
+
+ val &= ~BIT_ULL(63);
+ if (val > BIT_ULL(32))
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
static u32 vc4_get_fifo_full_level(u32 format)
{
static const u32 fifo_len_bytes = 64;
@@ -621,6 +694,15 @@ static int vc4_crtc_atomic_check(struct drm_crtc *crtc,
if (hweight32(state->connector_mask) > 1)
return -EINVAL;

+ if (state->ctm) {
+ /* The CTM hardware has no integer bits, so we check
+ * and reject scalars >1.0 that we have no chance of
+ * approximating.
+ */
+ if (vc4_crtc_atomic_check_ctm(state))
+ return -EINVAL;
+ }
+
drm_atomic_crtc_state_for_each_plane_state(plane, plane_state, state)
dlist_count += vc4_plane_dlist_size(plane_state);

@@ -697,8 +779,17 @@ static void vc4_crtc_atomic_flush(struct drm_crtc *crtc,
if (crtc->state->active && old_state->active)
vc4_crtc_update_dlist(crtc);

- if (crtc->state->color_mgmt_changed && crtc->state->gamma_lut)
- vc4_crtc_update_gamma_lut(crtc);
+ if (crtc->state->color_mgmt_changed) {
+ if (crtc->state->gamma_lut)
+ vc4_crtc_update_gamma_lut(crtc);
+
+ if (crtc->state->ctm)
+ vc4_crtc_update_ctm(crtc);
+ /* We are transitioning to CTM disabled. */
+ else if (old_state->ctm)
+ HVS_WRITE(SCALER_OLEDOFFS,
+ VC4_SET_FIELD(0, SCALER_OLEDOFFS_DISPFIFO));
+ }

if (debug_dump_regs) {
DRM_INFO("CRTC %d HVS after:\n", drm_crtc_index(crtc));
@@ -1036,7 +1127,7 @@ static int vc4_crtc_bind(struct device *dev, struct device *master, void *data)
primary_plane->crtc = crtc;
vc4_crtc->channel = vc4_crtc->data->hvs_channel;
drm_mode_crtc_set_gamma_size(crtc, ARRAY_SIZE(vc4_crtc->lut_r));
- drm_crtc_enable_color_mgmt(crtc, 0, false, crtc->gamma_size);
+ drm_crtc_enable_color_mgmt(crtc, 0, true, crtc->gamma_size);

/* Set up some arbitrary number of planes. We're not limited
* by a set number of physical registers, just the space in
--
2.14.1
Eric Anholt
2018-03-30 16:11:13 UTC
Permalink
Post by Stefan Schake
The hardware supports a CTM with S0.9 values. We therefore only allow
a value of 1.0 or fractional only and reject all others with integer
parts. This restriction is mostly inconsequential in practice since
commonly used transformation matrices have all scalars <= 1.0.
My primary concern with this patch is whether the OLEDCOEFFs apply
before or after gamma, since atomic is specific about what order they
happen in. I didn't find anything in the docs, so I'd have to pull up
RTL to confirm.
Post by Stefan Schake
---
v2: Simplify CTM atomic check (Ville)
drivers/gpu/drm/vc4/vc4_crtc.c | 97 ++++++++++++++++++++++++++++++++++++++++--
1 file changed, 94 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/vc4/vc4_crtc.c b/drivers/gpu/drm/vc4/vc4_crtc.c
index 8d71098d00c4..bafb0102fe1d 100644
--- a/drivers/gpu/drm/vc4/vc4_crtc.c
+++ b/drivers/gpu/drm/vc4/vc4_crtc.c
@@ -315,6 +315,79 @@ vc4_crtc_update_gamma_lut(struct drm_crtc *crtc)
vc4_crtc_lut_load(crtc);
}
+/* Converts a DRM S31.32 value to the HW S0.9 format. */
+static u16 vc4_crtc_s31_32_to_s0_9(u64 in)
+{
+ u16 r;
+
+ /* Sign bit. */
+ r = in & BIT_ULL(63) ? BIT(9) : 0;
+ /* We have zero integer bits so we can only saturate here. */
+ if ((in & GENMASK_ULL(62, 32)) > 0)
+ r |= GENMASK(8, 0);
+ /* Otherwise take the 9 most important fractional bits. */
+ else
+ r |= (in >> 22) & GENMASK(8, 0);
+ return r;
+}
+
+static void
+vc4_crtc_update_ctm(struct drm_crtc *crtc)
+{
+ struct drm_device *dev = crtc->dev;
+ struct vc4_dev *vc4 = to_vc4_dev(dev);
+ struct vc4_crtc *vc4_crtc = to_vc4_crtc(crtc);
+ struct drm_color_ctm *ctm = crtc->state->ctm->data;
+
+ HVS_WRITE(SCALER_OLEDCOEF2,
+ VC4_SET_FIELD(vc4_crtc_s31_32_to_s0_9(ctm->matrix[0]),
+ SCALER_OLEDCOEF2_R_TO_R) |
+ VC4_SET_FIELD(vc4_crtc_s31_32_to_s0_9(ctm->matrix[3]),
+ SCALER_OLEDCOEF2_R_TO_G) |
+ VC4_SET_FIELD(vc4_crtc_s31_32_to_s0_9(ctm->matrix[6]),
+ SCALER_OLEDCOEF2_R_TO_B));
+ HVS_WRITE(SCALER_OLEDCOEF1,
+ VC4_SET_FIELD(vc4_crtc_s31_32_to_s0_9(ctm->matrix[1]),
+ SCALER_OLEDCOEF1_G_TO_R) |
+ VC4_SET_FIELD(vc4_crtc_s31_32_to_s0_9(ctm->matrix[4]),
+ SCALER_OLEDCOEF1_G_TO_G) |
+ VC4_SET_FIELD(vc4_crtc_s31_32_to_s0_9(ctm->matrix[7]),
+ SCALER_OLEDCOEF1_G_TO_B));
+ HVS_WRITE(SCALER_OLEDCOEF0,
+ VC4_SET_FIELD(vc4_crtc_s31_32_to_s0_9(ctm->matrix[2]),
+ SCALER_OLEDCOEF0_B_TO_R) |
+ VC4_SET_FIELD(vc4_crtc_s31_32_to_s0_9(ctm->matrix[5]),
+ SCALER_OLEDCOEF0_B_TO_G) |
+ VC4_SET_FIELD(vc4_crtc_s31_32_to_s0_9(ctm->matrix[8]),
+ SCALER_OLEDCOEF0_B_TO_B));
+
+ /* Channel is 0-based but for DISPFIFO, 0 means disabled. */
+ HVS_WRITE(SCALER_OLEDOFFS, VC4_SET_FIELD(vc4_crtc->channel + 1,
+ SCALER_OLEDOFFS_DISPFIFO));
+}
+
+/* Check if the CTM contains valid input.
+ *
+ * DRM exposes CTM with S31.32 scalars, but the HW only supports S0.9.
+ * We don't allow integer values >1, and 1 only without fractional part
+ * to handle the common 1.0 value.
+ */
+static int vc4_crtc_atomic_check_ctm(struct drm_crtc_state *state)
+{
+ struct drm_color_ctm *ctm = state->ctm->data;
+ u32 i;
+
+ for (i = 0; i < ARRAY_SIZE(ctm->matrix); i++) {
+ u64 val = ctm->matrix[i];
+
+ val &= ~BIT_ULL(63);
+ if (val > BIT_ULL(32))
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
static u32 vc4_get_fifo_full_level(u32 format)
{
static const u32 fifo_len_bytes = 64;
@@ -621,6 +694,15 @@ static int vc4_crtc_atomic_check(struct drm_crtc *crtc,
if (hweight32(state->connector_mask) > 1)
return -EINVAL;
+ if (state->ctm) {
+ /* The CTM hardware has no integer bits, so we check
+ * and reject scalars >1.0 that we have no chance of
+ * approximating.
+ */
+ if (vc4_crtc_atomic_check_ctm(state))
+ return -EINVAL;
+ }
+
drm_atomic_crtc_state_for_each_plane_state(plane, plane_state, state)
dlist_count += vc4_plane_dlist_size(plane_state);
@@ -697,8 +779,17 @@ static void vc4_crtc_atomic_flush(struct drm_crtc *crtc,
if (crtc->state->active && old_state->active)
vc4_crtc_update_dlist(crtc);
- if (crtc->state->color_mgmt_changed && crtc->state->gamma_lut)
- vc4_crtc_update_gamma_lut(crtc);
+ if (crtc->state->color_mgmt_changed) {
+ if (crtc->state->gamma_lut)
+ vc4_crtc_update_gamma_lut(crtc);
+
+ if (crtc->state->ctm)
+ vc4_crtc_update_ctm(crtc);
+ /* We are transitioning to CTM disabled. */
+ else if (old_state->ctm)
+ HVS_WRITE(SCALER_OLEDOFFS,
+ VC4_SET_FIELD(0, SCALER_OLEDOFFS_DISPFIFO));
+ }
I find splitting the if from else without braces weird and error-prone.
Could we rewrite as:

+ if (crtc->state->ctm)
+ vc4_crtc_update_ctm(crtc);
+ else if (old_state->ctm) {
+ /* We are transitioning to CTM disabled. */
+ HVS_WRITE(SCALER_OLEDOFFS,
+ VC4_SET_FIELD(0, SCALER_OLEDOFFS_DISPFIFO));
+ }
+ }

Also, I think there might be a problem if you swap from CTM on CRTC1 to
CRTC 0 in a single commit -- CRTC0's CTM setup ends up getting disbaled
by CRTC1.

I think this should go away once you start tracking who's doing CTM at
the atomic state level, and put the SCALER_OLEDOFFS update into the
top-level atomic flush.

Stefan Schake
2018-03-25 18:14:35 UTC
Permalink
Hey Daniel,
Post by Daniel Stone
Hi Stefan,
+static int vc4_crtc_get_ctm_fifo(struct vc4_dev *vc4)
+{
+ return VC4_GET_FIELD(HVS_READ(SCALER_OLEDOFFS),
+ SCALER_OLEDOFFS_DISPFIFO);
+}
This needs to be managed as a global resource through atomic state
objects, rather than checking the current hardware state.
Do you mean as a property or some such that is accessible to userland
or merely that this could be raced?

I haven't had much luck finding examples for resources shared between CRTCs
in the current tree. My understanding here was that if userland commits
on CRTC B after a check-only on A, we are no longer bound by the earlier
result for the check-only. Otherwise, I would have to already commit my
CTM block to one CRTC at check (possibly check only) time.

Thanks,
Stefan
Daniel Vetter
2018-03-26 08:29:19 UTC
Permalink
Post by Stefan Schake
Hey Daniel,
Post by Daniel Stone
Hi Stefan,
+static int vc4_crtc_get_ctm_fifo(struct vc4_dev *vc4)
+{
+ return VC4_GET_FIELD(HVS_READ(SCALER_OLEDOFFS),
+ SCALER_OLEDOFFS_DISPFIFO);
+}
This needs to be managed as a global resource through atomic state
objects, rather than checking the current hardware state.
Do you mean as a property or some such that is accessible to userland
or merely that this could be raced?
I haven't had much luck finding examples for resources shared between CRTCs
in the current tree. My understanding here was that if userland commits
on CRTC B after a check-only on A, we are no longer bound by the earlier
result for the check-only. Otherwise, I would have to already commit my
CTM block to one CRTC at check (possibly check only) time.
https://dri.freedesktop.org/docs/drm/gpu/drm-kms.html#handling-driver-private-state

since you only have one CTM it's a shared resource which internally needs
to be tracked as a driver private thing.

Cheers, Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
Daniel Stone
2018-03-26 10:52:50 UTC
Permalink
Post by Daniel Vetter
Post by Stefan Schake
Post by Daniel Stone
+static int vc4_crtc_get_ctm_fifo(struct vc4_dev *vc4)
+{
+ return VC4_GET_FIELD(HVS_READ(SCALER_OLEDOFFS),
+ SCALER_OLEDOFFS_DISPFIFO);
+}
This needs to be managed as a global resource through atomic state
objects, rather than checking the current hardware state.
Do you mean as a property or some such that is accessible to userland
or merely that this could be raced?
I haven't had much luck finding examples for resources shared between CRTCs
in the current tree. My understanding here was that if userland commits
on CRTC B after a check-only on A, we are no longer bound by the earlier
result for the check-only. Otherwise, I would have to already commit my
CTM block to one CRTC at check (possibly check only) time.
https://dri.freedesktop.org/docs/drm/gpu/drm-kms.html#handling-driver-private-state
since you only have one CTM it's a shared resource which internally needs
to be tracked as a driver private thing.
Indeed, the above is exactly what I meant. Checking based on the
hardware status will falsely succeed if you go from having zero CRTCs
using CTM, to multiple CRTCs using CTM, in a single atomic commit, as
the hardware status won't have changed in time.

Cheers,
Daniel
Loading...