[PATCH 2/2] bsp/xilinx-zynq: Add device configuration driver

Patrick Gauvin pggauvin at gmail.com
Fri Apr 21 16:24:19 UTC 2017


On Thu, Apr 20, 2017 at 7:41 PM, Chris Johns <chrisj at rtems.org> wrote:

> On 21/04/2017 07:52, Patrick Gauvin wrote:
>
>> Chris,
>>
>>
>>         +
>>         +static drvdata data;
>>         +/* TODO: Abstract DMA buffer retrieval better */
>>
>>
>>     Agreed.
>>
>>
>>         +static uint8_t *dma_buf = NULL;
>>         +
>>         +/* Check if bit is set in reg (and not masked by mask), and if
>>         it is, write
>>         + * that bit to reg, returning true. Otherwise return false.
>>         + */
>>         +static inline bool check_and_set(
>>         +  volatile uint32_t *reg,
>>         +  uint32_t           mask,
>>         +  uint32_t           bit
>>         +)
>>         +{
>>         +  if ( *reg & bit & ~mask )
>>         +  {
>>         +    *reg = bit;
>>         +    return true;
>>         +  }
>>         +  return false;
>>         +}
>>         +
>>         +/* Only one semaphore is used since only one interrupt is
>>         unmasked at a time.
>>         + * The interrupt is cleared and masked after it is caught here.
>>         + */
>>         +static void zynq_devcfg_isr(
>>         +  void *args
>>         +)
>>
>>
>>     Why are you wanting to use interrupts rather than a simple sleep and
>>     poll? I am ok with a sychronous read/write driver interface here.
>>
>>     Do we need interrupts and the complexity they have here when loading
>>     the fabric is so critical to any other part of a system and must be
>>     done early?
>>
>>
>> I moved to interrupts since I wanted a way to implement timeouts that
>> allowed the driver to continue as soon as an operation completes. This
>> is mostly to benefit applications that will read from and write to the
>> PCAP during operation, after the bitstream is initially loaded, like an
>> FPGA scrubber.
>>
>
> Thanks, this makes sense if time is important. I assume you want to scrub
> the fabric without resorting to a reset?
>
> Would it make sense to provide an ioctl to scrub the fabric and to verify
> it? It would seem like a useful function other users may need.
>
> In terms of the RTEMS implementation, I find event send and receive is
> simpler with a low overhead not needing to allocate a semaphore. The
> interrupt checks a task id variable and if not 0 sends an event and the
> receive blocks on the receive.
>
>
A scrub ioctl does make sense, however, there are various methods for
performing scrubbing, and I don't think my lab will let me release the one
we use publicly.

Ah, thank you. Events sound much better suited for this, I'll replace the
semaphore with that.


>         +  int_sts = regs->int_sts;
>>         +  if (
>>         +    ZYNQ_DEVCFG_INT_AXI_WERR_INT_GET( int_sts )
>>         +    || ZYNQ_DEVCFG_INT_AXI_RTO_INT_GET( int_sts )
>>         +    || ZYNQ_DEVCFG_INT_AXI_RERR_INT_GET( int_sts )
>>         +    || ZYNQ_DEVCFG_INT_RX_FIFO_OV_INT_GET( int_sts )
>>         +    || ZYNQ_DEVCFG_INT_DMA_CMD_ERR_INT_GET( int_sts )
>>         +    || ZYNQ_DEVCFG_INT_DMA_Q_OV_INT_GET( int_sts )
>>         +    || ZYNQ_DEVCFG_INT_P2D_LEN_ERR_INT_GET( int_sts )
>>         +    || ZYNQ_DEVCFG_INT_PCFG_HMAC_ERR_INT_GET( int_sts )
>>         +  )
>>         +    return -2;
>>         +
>>         +  return 0;
>>         +}
>>         +
>>         +/**
>>         + * @brief Configure the PCAP controller.
>>         + */
>>         +static void pl_init(
>>         +  volatile zynq_devcfg_regs *regs
>>         +)
>>         +{
>>         +  regs->ctrl = ZYNQ_DEVCFG_CTRL_PCAP_MODE( 1 ) |
>>         +      ZYNQ_DEVCFG_CTRL_PCAP_PR( ZYNQ_DEVCFG_CTRL_PCAP_PR_PCAP ) |
>>         +      ZYNQ_DEVCFG_CTRL_RESERVED_BITS | regs->ctrl;
>>         +  /* Disable loopback */
>>         +  regs->mctrl = ZYNQ_DEVCFG_MCTRL_SET(
>>         +    regs->mctrl,
>>         +    ~ZYNQ_DEVCFG_MCTRL_INT_PCAP_LPBK( 1 ) & regs->mctrl
>>         +  );
>>         +  /* Clear all interrupts */
>>         +  regs->int_sts = ZYNQ_DEVCFG_INT_ALL;
>>         +
>>         +  if ( !data.secure )
>>         +  {
>>         +    if ( ZYNQ_DEVCFG_CTRL_QUARTER_PCAP_RATE_EN( regs->ctrl ) )
>>         +      regs->ctrl = ( ~ZYNQ_DEVCFG_CTRL_QUARTER_PCAP_RATE_EN( 1
>>         ) & regs->ctrl )
>>         +          | ZYNQ_DEVCFG_CTRL_RESERVED_BITS;
>>         +  }
>>         +  else
>>         +  {
>>         +    if ( !ZYNQ_DEVCFG_CTRL_QUARTER_PCAP_RATE_EN( regs->ctrl ) )
>>         +      regs->ctrl = ZYNQ_DEVCFG_CTRL_QUARTER_PCAP_RATE_EN( 1 ) |
>>         regs->ctrl
>>         +          | ZYNQ_DEVCFG_CTRL_RESERVED_BITS;
>>
>>
>>     Have you tested secure loading?
>>
>>
>> No, I have not yet.
>>
>>
> I am fine with this waiting until after the code has been merged. I can
> then help sort out secure loading. It is tricky to handle as you need a
> suitable key embedded in the device and that means a battery for the BBRAM
> to avoid needing to blow fuses. I have custom boards I can test this on.


Thanks, this does sound like it will take some time. I do have a board with
the eFUSE blown, so I should be able to assist in testing when the time
comes.

        +rtems_device_driver zynq_devcfg_read(
>>         +  rtems_device_major_number  major,
>>         +  rtems_device_minor_number  minor,
>>         +  void                      *args
>>         +)
>>         +{
>>         +  rtems_libio_rw_args_t *rw_args;
>>         +  int                    status;
>>         +  rtems_status_code      rstatus;
>>         +  rtems_status_code      final_status;
>>         +  uint8_t               *data_buf;
>>         +
>>         +  (void) major;
>>         +  (void) minor;
>>         +  rw_args = args;
>>         +  rw_args->bytes_moved = 0;
>>         +
>>         +  rstatus = rtems_semaphore_obtain( data.sem_id_internal,
>>         RTEMS_NO_WAIT, 0 );
>>         +  if ( RTEMS_SUCCESSFUL != rstatus )
>>         +  {
>>         +    final_status = RTEMS_RESOURCE_IN_USE;
>>         +    goto err_obtain;
>>         +  }
>>         +
>>         +  if ( rw_args->count < 4 )
>>         +  {
>>         +    final_status = RTEMS_INVALID_SIZE;
>>         +    goto err_insane;
>>         +  }
>>         +  /* TODO: It might be valid to read configuration registers
>>         while the PL is
>>         +   * not programmed.
>>         +   */
>>         +  /* PCFG_DONE must be asserted before readback */
>>         +  if ( !ZYNQ_DEVCFG_INT_PCFG_DONE_INT_GET( data.regs->int_sts )
>> )
>>         +  {
>>         +    WARN( "read attempted when FPGA configuration not done\n" );
>>         +    final_status = RTEMS_IO_ERROR;
>>         +    goto err_insane;
>>         +  }
>>         +
>>         +  if ( 0 != (size_t)rw_args->buffer % ZYNQ_DEVCFG_PCAP_DMA_ALIGN
>> )
>>         +    data_buf = dma_buf_get( rw_args->count );
>>         +  else
>>         +    data_buf = (uint8_t *)rw_args->buffer;
>>         +
>>         +  status = pcap_dma_xfer(
>>         +    data.regs,
>>         +    (uint32_t *)ZYNQ_DEVCFG_BITSTREAM_ADDR,
>>         +    rw_args->count / 4,
>>         +    (uint32_t *)( (uint32_t)data_buf | 1 ),
>>         +    rw_args->count / 4
>>         +  );
>>
>>
>>     What happens if the fabric is loaded with a secured bitfile? Is the
>>     hardware nice about you wanting to read a secured image or does it
>>     just stop and you need a reset?
>>
>>     What is the use case for a read? I only ask because you must have
>>     loaded the fabric in the first place so your application should have
>>     that file somewhere.
>>
>>
>> Now that you mention it, I remember reading something about a lockdown
>> related to this, I'll have to check the TRM.
>>
>> The primary use case for reads that I see is an FPGA scrubber which
>> checks and corrects configuration errors. At least in our lab, this is
>> the primary use.
>>
>
> Maybe it is simpler to track secure loading and if that happens do not
> allow a read back? I think it is safe to assume this driver is the only
> path to the PCAP.


Sounds good.


>         +  rstatus = rtems_semaphore_release( data.sem_id_internal );
>>         +err_insane:
>>         +  if ( RTEMS_SUCCESSFUL != rstatus )
>>         +    final_status = RTEMS_INTERNAL_ERROR;
>>         +err_obtain:
>>         +  return final_status;
>>         +}
>>         +
>>         +rtems_device_driver zynq_devcfg_write(
>>         +  rtems_device_major_number  major,
>>         +  rtems_device_minor_number  minor,
>>         +  void                      *args
>>         +)
>>         +{
>>         +  rtems_libio_rw_args_t *rw_args;
>>         +  int                    status;
>>         +  rtems_status_code      rstatus;
>>         +  rtems_status_code      final_status;
>>         +  uint8_t               *data_buf;
>>         +
>>         +  (void) major;
>>         +  (void) minor;
>>         +  rw_args = args;
>>         +  rw_args->bytes_moved = 0;
>>         +
>>         +  rstatus = rtems_semaphore_obtain( data.sem_id_internal,
>>         RTEMS_NO_WAIT, 0 );
>>         +  if ( RTEMS_SUCCESSFUL != rstatus )
>>         +  {
>>         +    final_status = RTEMS_RESOURCE_IN_USE;
>>         +    goto err_obtain;
>>         +  }
>>
>>
>>     I suggest a check of the data for a valid series of dummy words, the
>>     bus width auto detect and a sync word. This sequence is used by the
>>     PCAP loader to sync and align the bit stream to be processed.
>>
>>     For example here is a function that determines the endian format of
>>     the image before it is loaded:
>>
>>     static bool image_check_endian(const uint32_t* image)
>>     {
>>       /* Xilinx binary format header */
>>       const uint32_t bin_format[] = {
>>         0xffffffff, /* Dummy words */
>>         0xffffffff,
>>         0x000000bb, /* Bus width auto detect, word 1 */
>>         0x11220044, /* Bus width auto detect, word 2 */
>>         0xffffffff,
>>         0xffffffff,
>>         0xaa995566, /* Sync word */
>>       };
>>       size_t i;
>>
>>       for (i = 0; i < (sizeof(bin_format) / sizeof(uint32_t)); i++) {
>>         if (image[i] != bin_format[i]) {
>>           return false;
>>         }
>>       }
>>
>>       return true;
>>
>>     }
>>
>>
>> This is a good check to have, however I'll need to check that the
>> commands used with the PCAP after a full programming have a similar
>> header.
>>
>>
> I found the doco. The details are in UG470 (https://www.xilinx.com/suppor
> t/documentation/user_guides/ug470_7Series_Config.pdf) Section 5.
>
> Note, there are registers not documented here related to secure loading
> that are covered by an NDA. I have refused offers to that documentation.
> Currently the only tool I know that support those registers is `bootgen`.
>
> Also to be full compliant we should detect the byte order and manage the
> swap but this can wait.


Thanks, I'll use that to add the secure check mentioned above.

OK, I'll make sure that the byte order requirement is mentioned in the
documentation comments for now.

        +rtems_device_driver zynq_devcfg_control(
>>         +  rtems_device_major_number  major,
>>         +  rtems_device_minor_number  minor,
>>         +  void                      *args
>>         +)
>>         +{
>>         +  rtems_libio_ioctl_args_t *ioctl_args;
>>         +  char                     *str;
>>         +  int                       status;
>>         +  rtems_status_code         rstatus;
>>         +  rtems_status_code         final_status;
>>         +
>>         +  (void) major;
>>         +  (void) minor;
>>         +  ioctl_args = args;
>>         +
>>         +  rstatus = rtems_semaphore_obtain( data.sem_id_internal,
>>         RTEMS_NO_WAIT, 0 );
>>         +  if ( RTEMS_UNSATISFIED == rstatus )
>>         +  {
>>         +    ioctl_args->ioctl_return = -1;
>>         +    return RTEMS_RESOURCE_IN_USE;
>>         +  }
>>         +  else if ( RTEMS_SUCCESSFUL != rstatus )
>>         +  {
>>         +    ioctl_args->ioctl_return = -1;
>>         +    return RTEMS_INTERNAL_ERROR;
>>         +  }
>>         +
>>         +  final_status = RTEMS_SUCCESSFUL;
>>         +  ioctl_args->ioctl_return = 0;
>>         +  switch ( ioctl_args->command ) {
>>         +  case ZYNQ_DEVCFG_IOCTL_VERSION:
>>         +    str = ioctl_args->buffer;
>>         +    switch( ZYNQ_DEVCFG_MCTRL_PS_VERSION_GET( data.regs->mctrl
>>         ) ) {
>>         +      case ZYNQ_DEVCFG_MCTRL_PS_VERSION_1_0:
>>         +        strncpy( str, "1.0", ZYNQ_DEVCFG_IOCTL_VERSION_MAX_LEN
>> );
>>         +        break;
>>         +      case ZYNQ_DEVCFG_MCTRL_PS_VERSION_2_0:
>>         +        strncpy( str, "2.0", ZYNQ_DEVCFG_IOCTL_VERSION_MAX_LEN
>> );
>>         +        break;
>>         +      case ZYNQ_DEVCFG_MCTRL_PS_VERSION_3_0:
>>         +        strncpy( str, "3.0", ZYNQ_DEVCFG_IOCTL_VERSION_MAX_LEN
>> );
>>         +        break;
>>         +      case ZYNQ_DEVCFG_MCTRL_PS_VERSION_3_1:
>>         +        strncpy( str, "3.1", ZYNQ_DEVCFG_IOCTL_VERSION_MAX_LEN
>> );
>>         +        break;
>>         +      default:
>>         +        strncpy( str, "???", ZYNQ_DEVCFG_IOCTL_VERSION_MAX_LEN
>> );
>>         +        break;
>>         +    }
>>         +    break;
>>         +  case ZYNQ_DEVCFG_IOCTL_FPGA_PROGRAM_PRE:
>>         +    pl_init( data.regs );
>>         +    /* Hold FPGA clocks in reset */
>>         +    zynq_slcr_fpga_clk_rst( 0xf );
>>         +    /* Enable PS to PL level shifters */
>>         +    zynq_slcr_level_shifter_enable(
>>         ZYNQ_SLCR_LVL_SHFTR_EN_DISABLE );
>>         +    zynq_slcr_level_shifter_enable(
>>         ZYNQ_SLCR_LVL_SHFTR_EN_PS_TO_PL );
>>         +    status = pl_clear( data.regs );
>>         +    if ( 0 != status )
>>         +    {
>>         +      ioctl_args->ioctl_return = -1;
>>         +      final_status = RTEMS_UNSATISFIED;
>>         +    }
>>
>>
>>     Why are the load phases broken out like this? A use case type of
>>     example would be nice.
>>
>>
>> The load phases are broken out to keep the write function generic enough
>> to be used after the initial configuration of the FPGA. Here is an
>> example of an initial load, minus error checking:
>>
>> ioctl ( fd, ZYNQ_DEVCFG_IOCTL_FPGA_PROGRAM_PRE );
>> write( fd, BITSTREAM_DATA, BITSTREAM_LEN );
>> ioctl ( fd, ZYNQ_DEVCFG_IOCTL_FPGA_PROGRAM_WAIT_DONE );
>> ioctl ( fd, ZYNQ_DEVCFG_IOCTL_FPGA_PROGRAM_POST );
>>
>>
> I am wondering if it makes sense to have a couple of modes for the driver.
> One where a write is synchronous and does the above and one where you
> manually to follow this pattern?
>
> I feel the simplified case is more common than the specialised case so a
> simplified write is more robust. If for some reason this pattern needs to
> change the impact to those who reconfigure or support partial loading is
> smaller.


True, I'll add an ioctl to switch modes. Also, this will make the BIN
header check straightforward for the common case.

-Patrick
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.rtems.org/pipermail/devel/attachments/20170421/41f11d42/attachment-0002.html>


More information about the devel mailing list