<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Apr 20, 2017 at 7:41 PM, Chris Johns <span dir="ltr"><<a href="mailto:chrisj@rtems.org" target="_blank">chrisj@rtems.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="HOEnZb"><div class="h5">On 21/04/2017 07:52, Patrick Gauvin wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Chris,<br>
<br>
<br>
+<br>
+static drvdata data;<br>
+/* TODO: Abstract DMA buffer retrieval better */<br>
<br>
<br>
Agreed.<br>
<br>
<br>
+static uint8_t *dma_buf = NULL;<br>
+<br>
+/* Check if bit is set in reg (and not masked by mask), and if<br>
it is, write<br>
+ * that bit to reg, returning true. Otherwise return false.<br>
+ */<br>
+static inline bool check_and_set(<br>
+ volatile uint32_t *reg,<br>
+ uint32_t mask,<br>
+ uint32_t bit<br>
+)<br>
+{<br>
+ if ( *reg & bit & ~mask )<br>
+ {<br>
+ *reg = bit;<br>
+ return true;<br>
+ }<br>
+ return false;<br>
+}<br>
+<br>
+/* Only one semaphore is used since only one interrupt is<br>
unmasked at a time.<br>
+ * The interrupt is cleared and masked after it is caught here.<br>
+ */<br>
+static void zynq_devcfg_isr(<br>
+ void *args<br>
+)<br>
<br>
<br>
Why are you wanting to use interrupts rather than a simple sleep and<br>
poll? I am ok with a sychronous read/write driver interface here.<br>
<br>
Do we need interrupts and the complexity they have here when loading<br>
the fabric is so critical to any other part of a system and must be<br>
done early?<br>
<br>
<br>
I moved to interrupts since I wanted a way to implement timeouts that<br>
allowed the driver to continue as soon as an operation completes. This<br>
is mostly to benefit applications that will read from and write to the<br>
PCAP during operation, after the bitstream is initially loaded, like an<br>
FPGA scrubber.<br>
</blockquote>
<br></div></div>
Thanks, this makes sense if time is important. I assume you want to scrub the fabric without resorting to a reset?<br>
<br>
Would it make sense to provide an ioctl to scrub the fabric and to verify it? It would seem like a useful function other users may need.<br>
<br>
In terms of the RTEMS implementation, I find event send and receive is simpler with a low overhead not needing to allocate a semaphore. The interrupt checks a task id variable and if not 0 sends an event and the receive blocks on the receive.<div><div class="h5"><br></div></div></blockquote><div><br></div><div>A scrub ioctl does make sense, however, there are various methods for performing scrubbing, and I don't think my lab will let me release the one we use publicly.</div><div><br></div><div>Ah, thank you. Events sound much better suited for this, I'll replace the semaphore with that.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><div class="h5"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> + int_sts = regs->int_sts;<br>
+ if (<br>
+ ZYNQ_DEVCFG_INT_AXI_WERR_INT_G<wbr>ET( int_sts )<br>
+ || ZYNQ_DEVCFG_INT_AXI_RTO_INT_GE<wbr>T( int_sts )<br>
+ || ZYNQ_DEVCFG_INT_AXI_RERR_INT_G<wbr>ET( int_sts )<br>
+ || ZYNQ_DEVCFG_INT_RX_FIFO_OV_INT<wbr>_GET( int_sts )<br>
+ || ZYNQ_DEVCFG_INT_DMA_CMD_ERR_IN<wbr>T_GET( int_sts )<br>
+ || ZYNQ_DEVCFG_INT_DMA_Q_OV_INT_G<wbr>ET( int_sts )<br>
+ || ZYNQ_DEVCFG_INT_P2D_LEN_ERR_IN<wbr>T_GET( int_sts )<br>
+ || ZYNQ_DEVCFG_INT_PCFG_HMAC_ERR_<wbr>INT_GET( int_sts )<br>
+ )<br>
+ return -2;<br>
+<br>
+ return 0;<br>
+}<br>
+<br>
+/**<br>
+ * @brief Configure the PCAP controller.<br>
+ */<br>
+static void pl_init(<br>
+ volatile zynq_devcfg_regs *regs<br>
+)<br>
+{<br>
+ regs->ctrl = ZYNQ_DEVCFG_CTRL_PCAP_MODE( 1 ) |<br>
+ ZYNQ_DEVCFG_CTRL_PCAP_PR( ZYNQ_DEVCFG_CTRL_PCAP_PR_PCAP ) |<br>
+ ZYNQ_DEVCFG_CTRL_RESERVED_BITS | regs->ctrl;<br>
+ /* Disable loopback */<br>
+ regs->mctrl = ZYNQ_DEVCFG_MCTRL_SET(<br>
+ regs->mctrl,<br>
+ ~ZYNQ_DEVCFG_MCTRL_INT_PCAP_LP<wbr>BK( 1 ) & regs->mctrl<br>
+ );<br>
+ /* Clear all interrupts */<br>
+ regs->int_sts = ZYNQ_DEVCFG_INT_ALL;<br>
+<br>
+ if ( !data.secure )<br>
+ {<br>
+ if ( ZYNQ_DEVCFG_CTRL_QUARTER_PCAP_<wbr>RATE_EN( regs->ctrl ) )<br>
+ regs->ctrl = ( ~ZYNQ_DEVCFG_CTRL_QUARTER_PCAP<wbr>_RATE_EN( 1<br>
) & regs->ctrl )<br>
+ | ZYNQ_DEVCFG_CTRL_RESERVED_BITS<wbr>;<br>
+ }<br>
+ else<br>
+ {<br>
+ if ( !ZYNQ_DEVCFG_CTRL_QUARTER_PCAP<wbr>_RATE_EN( regs->ctrl ) )<br>
+ regs->ctrl = ZYNQ_DEVCFG_CTRL_QUARTER_PCAP_<wbr>RATE_EN( 1 ) |<br>
regs->ctrl<br>
+ | ZYNQ_DEVCFG_CTRL_RESERVED_BITS<wbr>;<br>
<br>
<br>
Have you tested secure loading?<br>
<br>
<br>
No, I have not yet.<br>
<br>
</blockquote>
<br></div></div>
I am fine with this waiting until after the code has been merged. I can then help sort out secure loading. It is tricky to handle as you need a suitable key embedded in the device and that means a battery for the BBRAM to avoid needing to blow fuses. I have custom boards I can test this on.</blockquote><div><br></div><div>Thanks, this does sound like it will take some time. I do have a board with the eFUSE blown, so I should be able to assist in testing when the time comes.</div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><div class="h5"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> +rtems_device_driver zynq_devcfg_read(<br>
+ rtems_device_major_number major,<br>
+ rtems_device_minor_number minor,<br>
+ void *args<br>
+)<br>
+{<br>
+ rtems_libio_rw_args_t *rw_args;<br>
+ int status;<br>
+ rtems_status_code rstatus;<br>
+ rtems_status_code final_status;<br>
+ uint8_t *data_buf;<br>
+<br>
+ (void) major;<br>
+ (void) minor;<br>
+ rw_args = args;<br>
+ rw_args->bytes_moved = 0;<br>
+<br>
+ rstatus = rtems_semaphore_obtain( data.sem_id_internal,<br>
RTEMS_NO_WAIT, 0 );<br>
+ if ( RTEMS_SUCCESSFUL != rstatus )<br>
+ {<br>
+ final_status = RTEMS_RESOURCE_IN_USE;<br>
+ goto err_obtain;<br>
+ }<br>
+<br>
+ if ( rw_args->count < 4 )<br>
+ {<br>
+ final_status = RTEMS_INVALID_SIZE;<br>
+ goto err_insane;<br>
+ }<br>
+ /* TODO: It might be valid to read configuration registers<br>
while the PL is<br>
+ * not programmed.<br>
+ */<br>
+ /* PCFG_DONE must be asserted before readback */<br>
+ if ( !ZYNQ_DEVCFG_INT_PCFG_DONE_INT<wbr>_GET( data.regs->int_sts ) )<br>
+ {<br>
+ WARN( "read attempted when FPGA configuration not done\n" );<br>
+ final_status = RTEMS_IO_ERROR;<br>
+ goto err_insane;<br>
+ }<br>
+<br>
+ if ( 0 != (size_t)rw_args->buffer % ZYNQ_DEVCFG_PCAP_DMA_ALIGN )<br>
+ data_buf = dma_buf_get( rw_args->count );<br>
+ else<br>
+ data_buf = (uint8_t *)rw_args->buffer;<br>
+<br>
+ status = pcap_dma_xfer(<br>
+ data.regs,<br>
+ (uint32_t *)ZYNQ_DEVCFG_BITSTREAM_ADDR,<br>
+ rw_args->count / 4,<br>
+ (uint32_t *)( (uint32_t)data_buf | 1 ),<br>
+ rw_args->count / 4<br>
+ );<br>
<br>
<br>
What happens if the fabric is loaded with a secured bitfile? Is the<br>
hardware nice about you wanting to read a secured image or does it<br>
just stop and you need a reset?<br>
<br>
What is the use case for a read? I only ask because you must have<br>
loaded the fabric in the first place so your application should have<br>
that file somewhere.<br>
<br>
<br>
Now that you mention it, I remember reading something about a lockdown<br>
related to this, I'll have to check the TRM.<br>
<br>
The primary use case for reads that I see is an FPGA scrubber which<br>
checks and corrects configuration errors. At least in our lab, this is<br>
the primary use.<br>
</blockquote>
<br></div></div>
Maybe it is simpler to track secure loading and if that happens do not allow a read back? I think it is safe to assume this driver is the only path to the PCAP.</blockquote><div><br></div><div>Sounds good.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><div class="h5"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> + rstatus = rtems_semaphore_release( data.sem_id_internal );<br>
+err_insane:<br>
+ if ( RTEMS_SUCCESSFUL != rstatus )<br>
+ final_status = RTEMS_INTERNAL_ERROR;<br>
+err_obtain:<br>
+ return final_status;<br>
+}<br>
+<br>
+rtems_device_driver zynq_devcfg_write(<br>
+ rtems_device_major_number major,<br>
+ rtems_device_minor_number minor,<br>
+ void *args<br>
+)<br>
+{<br>
+ rtems_libio_rw_args_t *rw_args;<br>
+ int status;<br>
+ rtems_status_code rstatus;<br>
+ rtems_status_code final_status;<br>
+ uint8_t *data_buf;<br>
+<br>
+ (void) major;<br>
+ (void) minor;<br>
+ rw_args = args;<br>
+ rw_args->bytes_moved = 0;<br>
+<br>
+ rstatus = rtems_semaphore_obtain( data.sem_id_internal,<br>
RTEMS_NO_WAIT, 0 );<br>
+ if ( RTEMS_SUCCESSFUL != rstatus )<br>
+ {<br>
+ final_status = RTEMS_RESOURCE_IN_USE;<br>
+ goto err_obtain;<br>
+ }<br>
<br>
<br>
I suggest a check of the data for a valid series of dummy words, the<br>
bus width auto detect and a sync word. This sequence is used by the<br>
PCAP loader to sync and align the bit stream to be processed.<br>
<br>
For example here is a function that determines the endian format of<br>
the image before it is loaded:<br>
<br>
static bool image_check_endian(const uint32_t* image)<br>
{<br>
/* Xilinx binary format header */<br>
const uint32_t bin_format[] = {<br>
0xffffffff, /* Dummy words */<br>
0xffffffff,<br>
0x000000bb, /* Bus width auto detect, word 1 */<br>
0x11220044, /* Bus width auto detect, word 2 */<br>
0xffffffff,<br>
0xffffffff,<br>
0xaa995566, /* Sync word */<br>
};<br>
size_t i;<br>
<br>
for (i = 0; i < (sizeof(bin_format) / sizeof(uint32_t)); i++) {<br>
if (image[i] != bin_format[i]) {<br>
return false;<br>
}<br>
}<br>
<br>
return true;<br>
<br>
}<br>
<br>
<br>
This is a good check to have, however I'll need to check that the<br>
commands used with the PCAP after a full programming have a similar header.<br>
<br>
</blockquote>
<br></div></div>
I found the doco. The details are in UG470 (<a href="https://www.xilinx.com/support/documentation/user_guides/ug470_7Series_Config.pdf" rel="noreferrer" target="_blank">https://www.xilinx.com/suppor<wbr>t/documentation/user_guides/<wbr>ug470_7Series_Config.pdf</a>) Section 5.<br>
<br>
Note, there are registers not documented here related to secure loading that are covered by an NDA. I have refused offers to that documentation. Currently the only tool I know that support those registers is `bootgen`.<br>
<br>
Also to be full compliant we should detect the byte order and manage the swap but this can wait.</blockquote><div><br></div><div>Thanks, I'll use that to add the secure check mentioned above.</div><div><br></div><div>OK, I'll make sure that the byte order requirement is mentioned in the documentation comments for now.</div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><div class="h5"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> +rtems_device_driver zynq_devcfg_control(<br>
+ rtems_device_major_number major,<br>
+ rtems_device_minor_number minor,<br>
+ void *args<br>
+)<br>
+{<br>
+ rtems_libio_ioctl_args_t *ioctl_args;<br>
+ char *str;<br>
+ int status;<br>
+ rtems_status_code rstatus;<br>
+ rtems_status_code final_status;<br>
+<br>
+ (void) major;<br>
+ (void) minor;<br>
+ ioctl_args = args;<br>
+<br>
+ rstatus = rtems_semaphore_obtain( data.sem_id_internal,<br>
RTEMS_NO_WAIT, 0 );<br>
+ if ( RTEMS_UNSATISFIED == rstatus )<br>
+ {<br>
+ ioctl_args->ioctl_return = -1;<br>
+ return RTEMS_RESOURCE_IN_USE;<br>
+ }<br>
+ else if ( RTEMS_SUCCESSFUL != rstatus )<br>
+ {<br>
+ ioctl_args->ioctl_return = -1;<br>
+ return RTEMS_INTERNAL_ERROR;<br>
+ }<br>
+<br>
+ final_status = RTEMS_SUCCESSFUL;<br>
+ ioctl_args->ioctl_return = 0;<br>
+ switch ( ioctl_args->command ) {<br>
+ case ZYNQ_DEVCFG_IOCTL_VERSION:<br>
+ str = ioctl_args->buffer;<br>
+ switch( ZYNQ_DEVCFG_MCTRL_PS_VERSION_G<wbr>ET( data.regs->mctrl<br>
) ) {<br>
+ case ZYNQ_DEVCFG_MCTRL_PS_VERSION_1<wbr>_0:<br>
+ strncpy( str, "1.0", ZYNQ_DEVCFG_IOCTL_VERSION_MAX_<wbr>LEN );<br>
+ break;<br>
+ case ZYNQ_DEVCFG_MCTRL_PS_VERSION_2<wbr>_0:<br>
+ strncpy( str, "2.0", ZYNQ_DEVCFG_IOCTL_VERSION_MAX_<wbr>LEN );<br>
+ break;<br>
+ case ZYNQ_DEVCFG_MCTRL_PS_VERSION_3<wbr>_0:<br>
+ strncpy( str, "3.0", ZYNQ_DEVCFG_IOCTL_VERSION_MAX_<wbr>LEN );<br>
+ break;<br>
+ case ZYNQ_DEVCFG_MCTRL_PS_VERSION_3<wbr>_1:<br>
+ strncpy( str, "3.1", ZYNQ_DEVCFG_IOCTL_VERSION_MAX_<wbr>LEN );<br>
+ break;<br>
+ default:<br>
+ strncpy( str, "???", ZYNQ_DEVCFG_IOCTL_VERSION_MAX_<wbr>LEN );<br>
+ break;<br>
+ }<br>
+ break;<br>
+ case ZYNQ_DEVCFG_IOCTL_FPGA_PROGRAM<wbr>_PRE:<br>
+ pl_init( data.regs );<br>
+ /* Hold FPGA clocks in reset */<br>
+ zynq_slcr_fpga_clk_rst( 0xf );<br>
+ /* Enable PS to PL level shifters */<br>
+ zynq_slcr_level_shifter_enable<wbr>(<br>
ZYNQ_SLCR_LVL_SHFTR_EN_DISABLE );<br>
+ zynq_slcr_level_shifter_enable<wbr>(<br>
ZYNQ_SLCR_LVL_SHFTR_EN_PS_TO_P<wbr>L );<br>
+ status = pl_clear( data.regs );<br>
+ if ( 0 != status )<br>
+ {<br>
+ ioctl_args->ioctl_return = -1;<br>
+ final_status = RTEMS_UNSATISFIED;<br>
+ }<br>
<br>
<br>
Why are the load phases broken out like this? A use case type of<br>
example would be nice.<br>
<br>
<br>
The load phases are broken out to keep the write function generic enough<br>
to be used after the initial configuration of the FPGA. Here is an<br>
example of an initial load, minus error checking:<br>
<br>
ioctl ( fd, ZYNQ_DEVCFG_IOCTL_FPGA_PROGRAM<wbr>_PRE );<br>
write( fd, BITSTREAM_DATA, BITSTREAM_LEN );<br>
ioctl ( fd, ZYNQ_DEVCFG_IOCTL_FPGA_PROGRAM<wbr>_WAIT_DONE );<br>
ioctl ( fd, ZYNQ_DEVCFG_IOCTL_FPGA_PROGRAM<wbr>_POST );<br>
<br>
</blockquote>
<br></div></div>
I am wondering if it makes sense to have a couple of modes for the driver. One where a write is synchronous and does the above and one where you manually to follow this pattern?<br>
<br>
I feel the simplified case is more common than the specialised case so a simplified write is more robust. If for some reason this pattern needs to change the impact to those who reconfigure or support partial loading is smaller.</blockquote><div><br></div><div>True, I'll add an ioctl to switch modes. Also, this will make the BIN header check straightforward for the common case.</div><div><br></div><div>-Patrick</div></div><br></div></div>