[PATCH 2/2] bsp/xilinx-zynq: Add device configuration driver

Chris Johns chrisj at rtems.org
Thu Apr 20 23:41:55 UTC 2017


On 21/04/2017 07:52, Patrick Gauvin wrote:
> Chris,
>
>
>         +
>         +static drvdata data;
>         +/* TODO: Abstract DMA buffer retrieval better */
>
>
>     Agreed.
>
>
>         +static uint8_t *dma_buf = NULL;
>         +
>         +/* Check if bit is set in reg (and not masked by mask), and if
>         it is, write
>         + * that bit to reg, returning true. Otherwise return false.
>         + */
>         +static inline bool check_and_set(
>         +  volatile uint32_t *reg,
>         +  uint32_t           mask,
>         +  uint32_t           bit
>         +)
>         +{
>         +  if ( *reg & bit & ~mask )
>         +  {
>         +    *reg = bit;
>         +    return true;
>         +  }
>         +  return false;
>         +}
>         +
>         +/* Only one semaphore is used since only one interrupt is
>         unmasked at a time.
>         + * The interrupt is cleared and masked after it is caught here.
>         + */
>         +static void zynq_devcfg_isr(
>         +  void *args
>         +)
>
>
>     Why are you wanting to use interrupts rather than a simple sleep and
>     poll? I am ok with a sychronous read/write driver interface here.
>
>     Do we need interrupts and the complexity they have here when loading
>     the fabric is so critical to any other part of a system and must be
>     done early?
>
>
> I moved to interrupts since I wanted a way to implement timeouts that
> allowed the driver to continue as soon as an operation completes. This
> is mostly to benefit applications that will read from and write to the
> PCAP during operation, after the bitstream is initially loaded, like an
> FPGA scrubber.

Thanks, this makes sense if time is important. I assume you want to 
scrub the fabric without resorting to a reset?

Would it make sense to provide an ioctl to scrub the fabric and to 
verify it? It would seem like a useful function other users may need.

In terms of the RTEMS implementation, I find event send and receive is 
simpler with a low overhead not needing to allocate a semaphore. The 
interrupt checks a task id variable and if not 0 sends an event and the 
receive blocks on the receive.

>
>
>
>         +{
>         +  uint32_t intrs[] = {
>         +    ZYNQ_DEVCFG_INT_DMA_DONE_INT,
>         +    ZYNQ_DEVCFG_INT_PCFG_INIT_PE_INT,
>         +    ZYNQ_DEVCFG_INT_PCFG_INIT_NE_INT,
>         +    ZYNQ_DEVCFG_INT_PCFG_DONE_INT,
>         +    ZYNQ_DEVCFG_INT_PSS_CFG_RESET_B_INT
>         +  };
>         +  volatile uint32_t *int_sts = &data.regs->int_sts;
>         +  volatile uint32_t *int_mask = &data.regs->int_mask;
>         +
>         +  (void) args;
>         +
>         +  for ( size_t i = 0; i < RTEMS_ARRAY_SIZE( intrs ); ++i )
>         +    if ( check_and_set( int_sts, *int_mask, intrs[i] ) )
>         +    {
>         +      *int_mask |= intrs[i];
>         +      rtems_semaphore_release( data.sem_id_int_wait );
>         +      return;
>         +    }
>         +}
>         +
>         +static size_t get_bitstream_len( void )
>         +{
>         +  switch ( ZYNQ_SLCR_PSS_IDCODE_DEVICE_GET(
>         zynq_slcr_pss_idcode_get() ) )
>         +  {
>         +  case ZYNQ_SLCR_PSS_IDCODE_DEVICE_7z007s:
>         +    return ZYNQ_DEVCFG_BITSTREAM_LEN_7z007s;
>         +  case ZYNQ_SLCR_PSS_IDCODE_DEVICE_7z012s:
>         +    return ZYNQ_DEVCFG_BITSTREAM_LEN_7z012s;
>         +  case ZYNQ_SLCR_PSS_IDCODE_DEVICE_7z014s:
>         +    return ZYNQ_DEVCFG_BITSTREAM_LEN_7z014s;
>         +  case ZYNQ_SLCR_PSS_IDCODE_DEVICE_7z010:
>         +    return ZYNQ_DEVCFG_BITSTREAM_LEN_7z010;
>         +  case ZYNQ_SLCR_PSS_IDCODE_DEVICE_7z015:
>         +    return ZYNQ_DEVCFG_BITSTREAM_LEN_7z015;
>         +  case ZYNQ_SLCR_PSS_IDCODE_DEVICE_7z020:
>         +    return ZYNQ_DEVCFG_BITSTREAM_LEN_7z020;
>         +  case ZYNQ_SLCR_PSS_IDCODE_DEVICE_7z030:
>         +    return ZYNQ_DEVCFG_BITSTREAM_LEN_7z030;
>         +  case ZYNQ_SLCR_PSS_IDCODE_DEVICE_7z035:
>         +    return ZYNQ_DEVCFG_BITSTREAM_LEN_7z035;
>         +  case ZYNQ_SLCR_PSS_IDCODE_DEVICE_7z045:
>         +    return ZYNQ_DEVCFG_BITSTREAM_LEN_7z045;
>         +  case ZYNQ_SLCR_PSS_IDCODE_DEVICE_7z100:
>         +    return ZYNQ_DEVCFG_BITSTREAM_LEN_7z100;
>         +  default:
>         +    return 0;
>         +  }
>         +}
>
>
>     See below.
>
>         +
>         +/**
>         + * @brief Create an aligned buffer for the bitstream.
>         + *
>         + * @param Desired length of the buffer in bytes.
>         + *
>         + * @retval NULL malloc failed.
>         + */
>         +static uint8_t *dma_buf_get(
>         +  size_t len
>         +)
>         +{
>         +  dma_buf = malloc( len + ZYNQ_DEVCFG_PCAP_DMA_ALIGN );
>         +  if ( NULL == dma_buf )
>         +    return NULL;
>         +  if ( (size_t)dma_buf % ZYNQ_DEVCFG_PCAP_DMA_ALIGN != 0 )
>         +  {
>         +    return dma_buf + ZYNQ_DEVCFG_PCAP_DMA_ALIGN
>         +        - ( (size_t)dma_buf % ZYNQ_DEVCFG_PCAP_DMA_ALIGN );
>         +  }
>         +  return dma_buf;
>
>
>     This and the next function need to be fixed to remove the static global.
>
>
> OK.
>
>
>
>         +}
>         +
>         +static void dma_buf_release( void )
>         +{
>         +  free( dma_buf );
>         +}
>         +
>         +/**
>         + * @brief Initiates a PCAP DMA transfer.
>         + *
>         + * @param src[in] For programming the FPGA, this is the
>         location of the
>         + * bitstream data. For readback, it is the location of the PL
>         readback command
>         + * sequence.
>         + * @param src_len Typically the length of bitstream in dwords,
>         or the number of
>         + * PL commands. The user must check this value for correctness.
>         + * @param dst[in,out] For programming the FPGA use
>         ZYNQ_DEVCFG_BITSTREAM_ADDR,
>         + * for readback this is where the readback data is stored.
>         + * @param dst_len Typically the Length of bitstream in dwords,
>         or the number of
>         + * readback words expected. The user must check this value for
>         correctness.
>         + *
>         + * @retval 0 Transfer was started.
>         + * @retval -1 src_len or dst_len invalid.
>         + * @retval -2 The DMA queue was full.
>         + */
>         +static int pcap_dma_xfer(
>         +  volatile zynq_devcfg_regs *regs,
>
>
>     There is only instance of these registers and they fixed in the
>     address space, so why have the argument? It is not like we have a
>     1..n instances and need to vary the address.
>
>
> Good point, I'll go through and remove zynq_devcfg_regs as arguments.
>
>
>
>
>         +  uint32_t                  *src,
>         +  size_t                     src_len,
>         +  uint32_t                  *dst,
>         +  size_t                     dst_len
>         +)
>         +{
>         +#ifdef ZYNQ_DEVCFG_DEBUG
>         +  printf( "DMA TRANSFER REQUESTED:\n" );
>         +  printf( "Source: %p\n", src );
>         +  printf( "Source length: %zu\n", src_len );
>         +  printf( "Destination: %p\n", dst );
>         +  printf( "Destination length: %zu\n", dst_len );
>         +#endif /* ZYNQ_DEVCFG_DEBUG */
>         +
>         +  if ( ZYNQ_DEVCFG_DMA_SRC_LEN_LEN( src_len ) != src_len )
>         +    return -1;
>         +  if ( ZYNQ_DEVCFG_DMA_DEST_LEN_LEN( dst_len ) != dst_len )
>         +    return -1;
>         +
>         +  /* Check if the command queue is full */
>         +  if ( ZYNQ_DEVCFG_STATUS_DMA_CMD_Q_F( regs->status ) )
>         +  {
>         +    WARN( "Zynq DMA queue full\n" );
>         +    return -2;
>         +  }
>         +
>         +  /* Order is important */
>         +  regs->dma_src_addr = (uint32_t)src;
>         +  regs->dma_dst_addr = (uint32_t)dst;
>         +  regs->dma_src_len = ZYNQ_DEVCFG_DMA_SRC_LEN_LEN( src_len );
>         +  regs->dma_dest_len = ZYNQ_DEVCFG_DMA_DEST_LEN_LEN( dst_len );
>         +
>         +  return 0;
>         +}
>         +
>         +static int pcap_dma_xfer_wait_and_check(
>         +  volatile zynq_devcfg_regs *regs
>
>
>     Same.
>
>         +)
>         +{
>         +  uint32_t          int_sts;
>         +  rtems_status_code status;
>         +
>         +  /* NOTE: The ISR will handle acknowledging the transfer. */
>         +  regs->int_mask &= ~ZYNQ_DEVCFG_INT_DMA_DONE_INT;
>         +  status = rtems_semaphore_obtain(
>         +    data.sem_id_int_wait,
>         +    RTEMS_WAIT,
>         +    INT_TIMEOUT
>         +  );
>         +  if ( RTEMS_SUCCESSFUL != status )
>         +  {
>         +    regs->int_mask |= ZYNQ_DEVCFG_INT_DMA_DONE_INT;
>         +    WARN( "DMA timed out\n" );
>         +    return -1;
>         +  }
>         +
>         +  /* TODO: More descriptive error handling */
>
>
>     Yes please. We need more than -1 and -2.
>
>
>         +  int_sts = regs->int_sts;
>         +  if (
>         +    ZYNQ_DEVCFG_INT_AXI_WERR_INT_GET( int_sts )
>         +    || ZYNQ_DEVCFG_INT_AXI_RTO_INT_GET( int_sts )
>         +    || ZYNQ_DEVCFG_INT_AXI_RERR_INT_GET( int_sts )
>         +    || ZYNQ_DEVCFG_INT_RX_FIFO_OV_INT_GET( int_sts )
>         +    || ZYNQ_DEVCFG_INT_DMA_CMD_ERR_INT_GET( int_sts )
>         +    || ZYNQ_DEVCFG_INT_DMA_Q_OV_INT_GET( int_sts )
>         +    || ZYNQ_DEVCFG_INT_P2D_LEN_ERR_INT_GET( int_sts )
>         +    || ZYNQ_DEVCFG_INT_PCFG_HMAC_ERR_INT_GET( int_sts )
>         +  )
>         +    return -2;
>         +
>         +  return 0;
>         +}
>         +
>         +/**
>         + * @brief Configure the PCAP controller.
>         + */
>         +static void pl_init(
>         +  volatile zynq_devcfg_regs *regs
>         +)
>         +{
>         +  regs->ctrl = ZYNQ_DEVCFG_CTRL_PCAP_MODE( 1 ) |
>         +      ZYNQ_DEVCFG_CTRL_PCAP_PR( ZYNQ_DEVCFG_CTRL_PCAP_PR_PCAP ) |
>         +      ZYNQ_DEVCFG_CTRL_RESERVED_BITS | regs->ctrl;
>         +  /* Disable loopback */
>         +  regs->mctrl = ZYNQ_DEVCFG_MCTRL_SET(
>         +    regs->mctrl,
>         +    ~ZYNQ_DEVCFG_MCTRL_INT_PCAP_LPBK( 1 ) & regs->mctrl
>         +  );
>         +  /* Clear all interrupts */
>         +  regs->int_sts = ZYNQ_DEVCFG_INT_ALL;
>         +
>         +  if ( !data.secure )
>         +  {
>         +    if ( ZYNQ_DEVCFG_CTRL_QUARTER_PCAP_RATE_EN( regs->ctrl ) )
>         +      regs->ctrl = ( ~ZYNQ_DEVCFG_CTRL_QUARTER_PCAP_RATE_EN( 1
>         ) & regs->ctrl )
>         +          | ZYNQ_DEVCFG_CTRL_RESERVED_BITS;
>         +  }
>         +  else
>         +  {
>         +    if ( !ZYNQ_DEVCFG_CTRL_QUARTER_PCAP_RATE_EN( regs->ctrl ) )
>         +      regs->ctrl = ZYNQ_DEVCFG_CTRL_QUARTER_PCAP_RATE_EN( 1 ) |
>         regs->ctrl
>         +          | ZYNQ_DEVCFG_CTRL_RESERVED_BITS;
>
>
>     Have you tested secure loading?
>
>
> No, I have not yet.
>

I am fine with this waiting until after the code has been merged. I can 
then help sort out secure loading. It is tricky to handle as you need a 
suitable key embedded in the device and that means a battery for the 
BBRAM to avoid needing to blow fuses. I have custom boards I can test 
this on.

>
>
>         +rtems_device_driver zynq_devcfg_read(
>         +  rtems_device_major_number  major,
>         +  rtems_device_minor_number  minor,
>         +  void                      *args
>         +)
>         +{
>         +  rtems_libio_rw_args_t *rw_args;
>         +  int                    status;
>         +  rtems_status_code      rstatus;
>         +  rtems_status_code      final_status;
>         +  uint8_t               *data_buf;
>         +
>         +  (void) major;
>         +  (void) minor;
>         +  rw_args = args;
>         +  rw_args->bytes_moved = 0;
>         +
>         +  rstatus = rtems_semaphore_obtain( data.sem_id_internal,
>         RTEMS_NO_WAIT, 0 );
>         +  if ( RTEMS_SUCCESSFUL != rstatus )
>         +  {
>         +    final_status = RTEMS_RESOURCE_IN_USE;
>         +    goto err_obtain;
>         +  }
>         +
>         +  if ( rw_args->count < 4 )
>         +  {
>         +    final_status = RTEMS_INVALID_SIZE;
>         +    goto err_insane;
>         +  }
>         +  /* TODO: It might be valid to read configuration registers
>         while the PL is
>         +   * not programmed.
>         +   */
>         +  /* PCFG_DONE must be asserted before readback */
>         +  if ( !ZYNQ_DEVCFG_INT_PCFG_DONE_INT_GET( data.regs->int_sts ) )
>         +  {
>         +    WARN( "read attempted when FPGA configuration not done\n" );
>         +    final_status = RTEMS_IO_ERROR;
>         +    goto err_insane;
>         +  }
>         +
>         +  if ( 0 != (size_t)rw_args->buffer % ZYNQ_DEVCFG_PCAP_DMA_ALIGN )
>         +    data_buf = dma_buf_get( rw_args->count );
>         +  else
>         +    data_buf = (uint8_t *)rw_args->buffer;
>         +
>         +  status = pcap_dma_xfer(
>         +    data.regs,
>         +    (uint32_t *)ZYNQ_DEVCFG_BITSTREAM_ADDR,
>         +    rw_args->count / 4,
>         +    (uint32_t *)( (uint32_t)data_buf | 1 ),
>         +    rw_args->count / 4
>         +  );
>
>
>     What happens if the fabric is loaded with a secured bitfile? Is the
>     hardware nice about you wanting to read a secured image or does it
>     just stop and you need a reset?
>
>     What is the use case for a read? I only ask because you must have
>     loaded the fabric in the first place so your application should have
>     that file somewhere.
>
>
> Now that you mention it, I remember reading something about a lockdown
> related to this, I'll have to check the TRM.
>
> The primary use case for reads that I see is an FPGA scrubber which
> checks and corrects configuration errors. At least in our lab, this is
> the primary use.

Maybe it is simpler to track secure loading and if that happens do not 
allow a read back? I think it is safe to assume this driver is the only 
path to the PCAP.

>
>
>         +  if ( status )
>         +  {
>         +    WARN( "DMA setup FAILED\n" );
>         +    final_status = RTEMS_IO_ERROR;
>         +    goto err_dma;
>         +  }
>         +  else
>         +  {
>         +    status = pcap_dma_xfer_wait_and_check( data.regs );
>         +    if ( status )
>         +    {
>         +      WARN( "DMA FAILED\n" );
>         +      final_status = RTEMS_IO_ERROR;
>         +      goto err_dma;
>         +    }
>         +  }
>         +
>         +  /* Ensure stale data is not read */
>         +  rtems_cache_invalidate_multiple_data_lines( data_buf,
>         rw_args->count );
>         +
>         +  final_status = RTEMS_SUCCESSFUL;
>         +  rw_args->bytes_moved = rw_args->count;
>         +  if ( data_buf != (uint8_t *)rw_args->buffer )
>         +    memcpy( rw_args->buffer, data_buf, rw_args->count );
>         +err_dma:
>         +  if ( data_buf != (uint8_t *)rw_args->buffer )
>         +    dma_buf_release();
>
>
>     dma_buf_release(data_buf) ?
>
>
> :) In retrospect, I'm not sure why I made the DMA buffer pointer global.
>

:)

>
>
>
>         +  rstatus = rtems_semaphore_release( data.sem_id_internal );
>         +err_insane:
>         +  if ( RTEMS_SUCCESSFUL != rstatus )
>         +    final_status = RTEMS_INTERNAL_ERROR;
>         +err_obtain:
>         +  return final_status;
>         +}
>         +
>         +rtems_device_driver zynq_devcfg_write(
>         +  rtems_device_major_number  major,
>         +  rtems_device_minor_number  minor,
>         +  void                      *args
>         +)
>         +{
>         +  rtems_libio_rw_args_t *rw_args;
>         +  int                    status;
>         +  rtems_status_code      rstatus;
>         +  rtems_status_code      final_status;
>         +  uint8_t               *data_buf;
>         +
>         +  (void) major;
>         +  (void) minor;
>         +  rw_args = args;
>         +  rw_args->bytes_moved = 0;
>         +
>         +  rstatus = rtems_semaphore_obtain( data.sem_id_internal,
>         RTEMS_NO_WAIT, 0 );
>         +  if ( RTEMS_SUCCESSFUL != rstatus )
>         +  {
>         +    final_status = RTEMS_RESOURCE_IN_USE;
>         +    goto err_obtain;
>         +  }
>
>
>     I suggest a check of the data for a valid series of dummy words, the
>     bus width auto detect and a sync word. This sequence is used by the
>     PCAP loader to sync and align the bit stream to be processed.
>
>     For example here is a function that determines the endian format of
>     the image before it is loaded:
>
>     static bool image_check_endian(const uint32_t* image)
>     {
>       /* Xilinx binary format header */
>       const uint32_t bin_format[] = {
>         0xffffffff, /* Dummy words */
>         0xffffffff,
>         0x000000bb, /* Bus width auto detect, word 1 */
>         0x11220044, /* Bus width auto detect, word 2 */
>         0xffffffff,
>         0xffffffff,
>         0xaa995566, /* Sync word */
>       };
>       size_t i;
>
>       for (i = 0; i < (sizeof(bin_format) / sizeof(uint32_t)); i++) {
>         if (image[i] != bin_format[i]) {
>           return false;
>         }
>       }
>
>       return true;
>
>     }
>
>
> This is a good check to have, however I'll need to check that the
> commands used with the PCAP after a full programming have a similar header.
>

I found the doco. The details are in UG470 
(https://www.xilinx.com/support/documentation/user_guides/ug470_7Series_Config.pdf) 
Section 5.

Note, there are registers not documented here related to secure loading 
that are covered by an NDA. I have refused offers to that documentation. 
Currently the only tool I know that support those registers is `bootgen`.

Also to be full compliant we should detect the byte order and manage the 
swap but this can wait.

>
>
>         +rtems_device_driver zynq_devcfg_control(
>         +  rtems_device_major_number  major,
>         +  rtems_device_minor_number  minor,
>         +  void                      *args
>         +)
>         +{
>         +  rtems_libio_ioctl_args_t *ioctl_args;
>         +  char                     *str;
>         +  int                       status;
>         +  rtems_status_code         rstatus;
>         +  rtems_status_code         final_status;
>         +
>         +  (void) major;
>         +  (void) minor;
>         +  ioctl_args = args;
>         +
>         +  rstatus = rtems_semaphore_obtain( data.sem_id_internal,
>         RTEMS_NO_WAIT, 0 );
>         +  if ( RTEMS_UNSATISFIED == rstatus )
>         +  {
>         +    ioctl_args->ioctl_return = -1;
>         +    return RTEMS_RESOURCE_IN_USE;
>         +  }
>         +  else if ( RTEMS_SUCCESSFUL != rstatus )
>         +  {
>         +    ioctl_args->ioctl_return = -1;
>         +    return RTEMS_INTERNAL_ERROR;
>         +  }
>         +
>         +  final_status = RTEMS_SUCCESSFUL;
>         +  ioctl_args->ioctl_return = 0;
>         +  switch ( ioctl_args->command ) {
>         +  case ZYNQ_DEVCFG_IOCTL_VERSION:
>         +    str = ioctl_args->buffer;
>         +    switch( ZYNQ_DEVCFG_MCTRL_PS_VERSION_GET( data.regs->mctrl
>         ) ) {
>         +      case ZYNQ_DEVCFG_MCTRL_PS_VERSION_1_0:
>         +        strncpy( str, "1.0", ZYNQ_DEVCFG_IOCTL_VERSION_MAX_LEN );
>         +        break;
>         +      case ZYNQ_DEVCFG_MCTRL_PS_VERSION_2_0:
>         +        strncpy( str, "2.0", ZYNQ_DEVCFG_IOCTL_VERSION_MAX_LEN );
>         +        break;
>         +      case ZYNQ_DEVCFG_MCTRL_PS_VERSION_3_0:
>         +        strncpy( str, "3.0", ZYNQ_DEVCFG_IOCTL_VERSION_MAX_LEN );
>         +        break;
>         +      case ZYNQ_DEVCFG_MCTRL_PS_VERSION_3_1:
>         +        strncpy( str, "3.1", ZYNQ_DEVCFG_IOCTL_VERSION_MAX_LEN );
>         +        break;
>         +      default:
>         +        strncpy( str, "???", ZYNQ_DEVCFG_IOCTL_VERSION_MAX_LEN );
>         +        break;
>         +    }
>         +    break;
>         +  case ZYNQ_DEVCFG_IOCTL_FPGA_PROGRAM_PRE:
>         +    pl_init( data.regs );
>         +    /* Hold FPGA clocks in reset */
>         +    zynq_slcr_fpga_clk_rst( 0xf );
>         +    /* Enable PS to PL level shifters */
>         +    zynq_slcr_level_shifter_enable(
>         ZYNQ_SLCR_LVL_SHFTR_EN_DISABLE );
>         +    zynq_slcr_level_shifter_enable(
>         ZYNQ_SLCR_LVL_SHFTR_EN_PS_TO_PL );
>         +    status = pl_clear( data.regs );
>         +    if ( 0 != status )
>         +    {
>         +      ioctl_args->ioctl_return = -1;
>         +      final_status = RTEMS_UNSATISFIED;
>         +    }
>
>
>     Why are the load phases broken out like this? A use case type of
>     example would be nice.
>
>
> The load phases are broken out to keep the write function generic enough
> to be used after the initial configuration of the FPGA. Here is an
> example of an initial load, minus error checking:
>
> ioctl ( fd, ZYNQ_DEVCFG_IOCTL_FPGA_PROGRAM_PRE );
> write( fd, BITSTREAM_DATA, BITSTREAM_LEN );
> ioctl ( fd, ZYNQ_DEVCFG_IOCTL_FPGA_PROGRAM_WAIT_DONE );
> ioctl ( fd, ZYNQ_DEVCFG_IOCTL_FPGA_PROGRAM_POST );
>

I am wondering if it makes sense to have a couple of modes for the 
driver. One where a write is synchronous and does the above and one 
where you manually to follow this pattern?

I feel the simplified case is more common than the specialised case so a 
simplified write is more robust. If for some reason this pattern needs 
to change the impact to those who reconfigure or support partial loading 
is smaller.

>
>
>
>         +    break;
>         +  case ZYNQ_DEVCFG_IOCTL_FPGA_PROGRAM_POST:
>         +    /* Enable all PS-PL level shifters */
>         +    zynq_slcr_level_shifter_enable( ZYNQ_SLCR_LVL_SHFTR_EN_ALL );
>         +    /* Release FPGA clocks from reset */
>         +    zynq_slcr_fpga_clk_rst( 0 );
>         +    break;
>         +  case ZYNQ_DEVCFG_IOCTL_FPGA_PROGRAM_WAIT_DONE:
>         +    data.regs->int_mask &= ~ZYNQ_DEVCFG_INT_PCFG_DONE_INT;
>         +    status = rtems_semaphore_obtain(
>         +      data.sem_id_int_wait,
>         +      RTEMS_WAIT,
>         +      INT_TIMEOUT
>         +    );
>         +    if ( RTEMS_SUCCESSFUL != status )
>         +    {
>         +      ioctl_args->ioctl_return = -1;
>         +      data.regs->int_mask |= ZYNQ_DEVCFG_INT_PCFG_DONE_INT;
>         +      final_status = RTEMS_TIMEOUT;
>         +    }
>
>
>     I am still not seeing the need for asychronous operation.
>
>         +    break;
>         +  case ZYNQ_DEVCFG_IOCTL_SET_SECURE:
>         +    data.secure = *(bool *)ioctl_args->buffer;
>         +    break;
>
>
>     We can take this off line, but it is easier to detect the secure bit
>     in the bitfile and so this call is not needed. We can also not
>     bother to load a secure bitfile if the Zynq is not secure.
>
>     While on the topic of secure modes, data can be decrypt by passing
>     it in and then out of the PCAN. When I tested this a few years ago I
>     found it only worked with the OCM. We should support doing this as
>     it is a requirement on some system that data be secured with the
>     same AES key as the bitfile. I am not sure how this would fit into
>     the device model being used here, I suppose the returning data is
>     buffered until read or the device is closed.
>
>
> OK, I'm open to talking more about secure mode offline. It sounds like
> you've used it extensively.

Great, I have it working on a couple of projects.

>
>
>
>
>         +  case ZYNQ_DEVCFG_IOCTL_BITSTREAM_LEN_GET:
>         +    *(size_t *)ioctl_args->buffer = get_bitstream_len();
>         +    break;
>
>
>     What is purpose of this?
>
>
> I made that function originally when I was checking the length of the
> input data. After I realized I needed to support more than just
> full-bitstream programming, and that the input data was just commands,
> it really isn't that useful anymore, since the user will already know
> the length of their commands anyway. I'll take this out, along with the
> get_bitstream_len function.
>

Thanks.

>
>
>
>         +  default:
>         +    ioctl_args->ioctl_return = -1;
>         +    final_status = RTEMS_INVALID_NAME; /* Maps to EINVAL */
>         +    break;
>         +  }
>         +
>         +  rstatus = rtems_semaphore_release( data.sem_id_internal );
>         +  if ( rstatus != RTEMS_SUCCESSFUL )
>         +    WARN( "Failed to release internal semaphore\n" );
>         +  return final_status;
>         +}
>         diff --git
>         a/c/src/lib/libbsp/arm/xilinx-zynq/include/zynq-devcfg-regs.h
>         b/c/src/lib/libbsp/arm/xilinx-zynq/include/zynq-devcfg-regs.h
>         new file mode 100644
>         index 0000000..d3601d6
>         --- /dev/null
>         +++ b/c/src/lib/libbsp/arm/xilinx-zynq/include/zynq-devcfg-regs.h
>         @@ -0,0 +1,194 @@
>         +/**
>         + * @file
>         + * @ingroup zynq_devcfg
>         + * @brief Device configuration interface register definitions.
>         + */
>         +
>         +/*
>         + * Copyright (c) 2016
>         + *  NSF Center for High-Performance Reconfigurable Computing
>         (CHREC),
>         + *  University of Florida.  All rights reserved.
>         + * Copyright (c) 2017
>         + *  NSF Center for High-Performance Reconfigurable Computing
>         (CHREC),
>         + *  University of Pittsburgh.  All rights reserved.
>         + *
>         + * Redistribution and use in source and binary forms, with or
>         without
>         + * modification, are permitted provided that the following
>         conditions are
>         + * met:
>         + * 1. Redistributions of source code must retain the above
>         copyright
>         + *    notice, this list of conditions and the following disclaimer.
>         + * 2. Redistributions in binary form must reproduce the above
>         copyright
>         + *    notice, this list of conditions and the following
>         disclaimer in the
>         + *    documentation and/or other materials provided with the
>         distribution.
>         + *
>         + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND
>         CONTRIBUTORS "AS
>         + * IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT
>         NOT LIMITED
>         + * TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A
>         + * PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
>         COPYRIGHT OWNER
>         + * OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
>         INCIDENTAL, SPECIAL,
>         + * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
>         LIMITED TO,
>         + * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
>         DATA, OR
>         + * PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
>         THEORY OF
>         + * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
>         (INCLUDING
>         + * NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
>         OF THIS
>         + * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
>         + *
>         + * The views and conclusions contained in the software and
>         documentation
>         + * are those of the authors and should not be interpreted as
>         representing
>         + * official policies, either expressed or implied, of CHREC.
>         + *
>         + * Author: Patrick Gauvin <gauvin at hcs.ufl.edu
>         <mailto:gauvin at hcs.ufl.edu>>
>         + */
>         +
>         +/**
>         + * @defgroup zynq_devcfg_regs Device Configuration Interface
>         Register Definitions
>         + * @ingroup zynq_devcfg
>         + * @brief Device Configuration Interface Register Definitions
>         + */
>         +
>         +#ifndef LIBBSP_ARM_XILINX_ZYNQ_DEVCFG_REGS_H
>         +#define LIBBSP_ARM_XILINX_ZYNQ_DEVCFG_REGS_H
>         +
>         +#include <bsp/utility.h>
>         +
>         +#ifdef __cplusplus
>         +extern "C" {
>         +#endif /* __cplusplus */
>         +
>         +/* Zynq-7000 series devcfg address */
>         +#define ZYNQ_DEVCFG_BASE_ADDR 0xF8007000UL
>         +/* For use with the PCAP DMA */
>         +#define ZYNQ_DEVCFG_BITSTREAM_ADDR 0xFFFFFFFFUL
>         +/* PCAP DMA transfers must be 64-byte aligned */
>         +#define ZYNQ_DEVCFG_PCAP_DMA_ALIGN 64
>         +#define ZYNQ_DEVCFG_INTERRUPT_VECTOR 40
>         +
>         +typedef struct {
>         +  uint32_t ctrl;
>         +#define ZYNQ_DEVCFG_CTRL_FORCE_RST( val ) BSP_FLD32( val, 31, 31 )
>         +#define ZYNQ_DEVCFG_CTRL_FORCE_RST_GET( reg ) BSP_FLD32GET(
>         reg, 31, 31 )
>         +#define ZYNQ_DEVCFG_CTRL_PCFG_PROG_B_GET( reg ) BSP_FLD32GET(
>         reg, 30, 30 )
>         +#define ZYNQ_DEVCFG_CTRL_PCFG_PROG_B( val ) BSP_FLD32( val, 30,
>         30 )
>         +#define ZYNQ_DEVCFG_CTRL_PCFG_POR_CNT_4K_GET( reg )
>         BSP_FLD32GET( reg, 29, 29 )
>         +#define ZYNQ_DEVCFG_CTRL_PCFG_POR_CNT_4K( val ) BSP_FLD32( val,
>         29, 29 )
>         +#define ZYNQ_DEVCFG_CTRL_PCAP_PR( val ) BSP_FLD32( val, 27, 27 )
>         +#define ZYNQ_DEVCFG_CTRL_PCAP_PR_GET( reg ) BSP_FLD32GET( reg,
>         27, 27 )
>         +#define ZYNQ_DEVCFG_CTRL_PCAP_PR_ICAP ( 0 )
>         +#define ZYNQ_DEVCFG_CTRL_PCAP_PR_PCAP ( 1 )
>         +#define ZYNQ_DEVCFG_CTRL_PCAP_MODE( val ) BSP_FLD32( val, 26, 26 )
>         +#define ZYNQ_DEVCFG_CTRL_PCAP_MODE_GET( reg ) BSP_FLD32GET(
>         reg, 26, 26 )
>         +#define ZYNQ_DEVCFG_CTRL_QUARTER_PCAP_RATE_EN( val ) BSP_FLD32(
>         val, 25, 25 )
>         +#define ZYNQ_DEVCFG_CTRL_QUARTER_PCAP_RATE_EN_GET( reg ) \
>         +  BSP_FLD32GET( reg, 25, 25 )
>         +#define ZYNQ_DEVCFG_CTRL_MULTIBOOT_EN( val ) BSP_FLD32( val,
>         24, 24 )
>         +#define ZYNQ_DEVCFG_CTRL_MULTIBOOT_EN_GET( reg ) BSP_FLD32GET(
>         reg, 24, 24 )
>         +#define ZYNQ_DEVCFG_CTRL_JTAG_CHAIN_DIS( val ) BSP_FLD32( val,
>         23, 23 )
>         +#define ZYNQ_DEVCFG_CTRL_JTAG_CHAIN_DIS_GET( reg )
>         BSP_FLD32GET( reg, 23, 23 )
>         +#define ZYNQ_DEVCFG_CTRL_PCFG_AES_FUSE( val ) BSP_FLD32( val,
>         12, 12 )
>         +#define ZYNQ_DEVCFG_CTRL_PCFG_AES_FUSE_GET( reg ) BSP_FLD32GET(
>         reg, 12, 12 )
>         +#define ZYNQ_DEVCFG_CTRL_PCFG_AES_FUSE_BBRAM ( 0 )
>         +#define ZYNQ_DEVCFG_CTRL_PCFG_AES_FUSE_EFUSE ( 1 )
>         +#define ZYNQ_DEVCFG_CTRL_PCFG_AES_EN( val ) BSP_FLD32( val, 9, 11 )
>         +#define ZYNQ_DEVCFG_CTRL_PCFG_AES_EN_GET( reg ) BSP_FLD32GET(
>         reg, 9, 11 )
>         +#define ZYNQ_DEVCFG_CTRL_PCFG_AES_EN_ENABLE ( 0x3 )
>         +#define ZYNQ_DEVCFG_CTRL_PCFG_AES_EN_DISABLE ( 0x0 )
>         +#define ZYNQ_DEVCFG_CTRL_PCFG_AES_EN_LOCKDOWN ( 0x1 )
>         +#define ZYNQ_DEVCFG_CTRL_PCFG_SEU_EN( val ) BSP_FLD32( val, 8, 8 )
>         +#define ZYNQ_DEVCFG_CTRL_PCFG_SEU_EN_GET( reg ) BSP_FLD32GET(
>         reg, 8, 8 )
>         +#define ZYNQ_DEVCFG_CTRL_PCFG_SEC_EN_GET( reg ) BSP_FLD32GET(
>         reg, 7, 7 )
>         +#define ZYNQ_DEVCFG_CTRL_PCFG_SPNIDEN( val ) BSP_FLD32( val, 6, 6 )
>         +#define ZYNQ_DEVCFG_CTRL_PCFG_SPNIDEN_GET( reg ) BSP_FLD32GET(
>         reg, 6, 6 )
>         +#define ZYNQ_DEVCFG_CTRL_PCFG_SPIDEN( val ) BSP_FLD32( val, 5, 5 )
>         +#define ZYNQ_DEVCFG_CTRL_PCFG_SPIDEN_GET( reg ) BSP_FLD32GET(
>         reg, 5, 5 )
>         +#define ZYNQ_DEVCFG_CTRL_PCFG_NIDEN( val ) BSP_FLD32( val, 4, 4 )
>         +#define ZYNQ_DEVCFG_CTRL_PCFG_NIDEN_GET( reg ) BSP_FLD32GET(
>         reg, 4, 4 )
>         +#define ZYNQ_DEVCFG_CTRL_PCFG_DBGEN( val ) BSP_FLD32( val, 3, 3 )
>         +#define ZYNQ_DEVCFG_CTRL_PCFG_DBGEN_GET( reg ) BSP_FLD32GET(
>         reg, 3, 3 )
>         +#define ZYNQ_DEVCFG_CTRL_PCFG_DAP_EN( val ) BSP_FLD32( val, 0, 2 )
>         +#define ZYNQ_DEVCFG_CTRL_PCFG_DAP_EN_GET( reg ) BSP_FLD32GET(
>         reg, 0, 2 )
>         +#define ZYNQ_DEVCFG_CTRL_PCFG_DAP_EN_ENABLE ( 0x3 )
>         +#define ZYNQ_DEVCFG_CTRL_PCFG_DAP_EN_BYPASS ( 0x0 )
>         +#define ZYNQ_DEVCFG_CTRL_RESERVED_BITS ( 0x6000 )
>         +  uint32_t lock;
>         +  uint32_t cfg;
>         +  /* int_sts and int_mask directly overlap, so they share the
>         ZYNQ_DEVCFG_INT_*
>         +   * macros */
>         +  uint32_t int_sts;
>         +  uint32_t int_mask;
>         +#define ZYNQ_DEVCFG_INT_PSS_CFG_RESET_B_INT BSP_BIT32( 27 )
>         +#define ZYNQ_DEVCFG_INT_PSS_CFG_RESET_B_INT_GET( reg ) \
>         +  BSP_FLD32GET( reg, 27, 27 )
>         +#define ZYNQ_DEVCFG_INT_AXI_WERR_INT_GET( reg ) BSP_FLD32GET(
>         reg, 22, 22 )
>         +#define ZYNQ_DEVCFG_INT_AXI_RTO_INT_GET( reg ) BSP_FLD32GET(
>         reg, 21, 21 )
>         +#define ZYNQ_DEVCFG_INT_AXI_RERR_INT_GET( reg ) BSP_FLD32GET(
>         reg, 20, 20 )
>         +#define ZYNQ_DEVCFG_INT_RX_FIFO_OV_INT_GET( reg ) \
>         +  BSP_FLD32GET( reg, 18, 18 )
>         +#define ZYNQ_DEVCFG_INT_DMA_CMD_ERR_INT_GET( reg ) \
>         +  BSP_FLD32GET( reg, 15, 15 )
>         +#define ZYNQ_DEVCFG_INT_DMA_Q_OV_INT_GET( reg ) BSP_FLD32GET(
>         reg, 14, 14 )
>         +#define ZYNQ_DEVCFG_INT_DMA_DONE_INT BSP_BIT32( 13 )
>         +#define ZYNQ_DEVCFG_INT_DMA_DONE_INT_GET( reg ) BSP_FLD32GET(
>         reg, 13, 13 )
>         +#define ZYNQ_DEVCFG_INT_D_P_DONE_INT BSP_BIT32( 12 )
>         +#define ZYNQ_DEVCFG_INT_D_P_DONE_INT_GET( reg ) BSP_FLD32GET(
>         reg, 12, 12 )
>         +#define ZYNQ_DEVCFG_INT_P2D_LEN_ERR_INT_GET( reg ) \
>         +  BSP_FLD32GET( reg, 11, 11 )
>         +#define ZYNQ_DEVCFG_INT_PCFG_HMAC_ERR_INT_GET( reg ) \
>         +  BSP_FLD32GET( reg, 6, 6 )
>         +#define ZYNQ_DEVCFG_INT_PCFG_SEU_ERR_INT_GET( reg ) \
>         +  BSP_FLD32GET( reg, 5, 5 )
>         +#define ZYNQ_DEVCFG_INT_PCFG_POR_B_INT_GET( reg ) BSP_FLD32GET(
>         reg, 4, 4 )
>         +#define ZYNQ_DEVCFG_INT_PCFG_CFG_RST_INT_GET( reg ) \
>         +  BSP_FLD32GET( reg, 3, 3 )
>         +#define ZYNQ_DEVCFG_INT_PCFG_DONE_INT BSP_BIT32( 2 )
>         +#define ZYNQ_DEVCFG_INT_PCFG_DONE_INT_GET( reg ) BSP_FLD32GET(
>         reg, 2, 2 )
>         +#define ZYNQ_DEVCFG_INT_PCFG_INIT_PE_INT BSP_BIT32( 1 )
>         +#define ZYNQ_DEVCFG_INT_PCFG_INIT_PE_INT_GET( reg ) \
>         +  BSP_FLD32GET( reg, 1, 1 )
>         +#define ZYNQ_DEVCFG_INT_PCFG_INIT_NE_INT BSP_BIT32( 0 )
>         +#define ZYNQ_DEVCFG_INT_PCFG_INIT_NE_INT_GET( reg ) \
>         +  BSP_FLD32GET( reg, 0, 0 )
>         +#define ZYNQ_DEVCFG_INT_ALL ( 0xf8f7f87f )
>         +  uint32_t status;
>         +#define ZYNQ_DEVCFG_STATUS_DMA_CMD_Q_F( val ) BSP_FLD32( val,
>         31, 31 )
>         +#define ZYNQ_DEVCFG_STATUS_DMA_CMD_Q_F_GET( reg ) BSP_FLD32GET(
>         reg, 31, 31 )
>         +#define ZYNQ_DEVCFG_STATUS_PCFG_INIT_GET( reg ) BSP_FLD32GET(
>         reg, 4, 4 )
>         +  uint32_t dma_src_addr;
>         +  uint32_t dma_dst_addr;
>         +  uint32_t dma_src_len;
>         +#define ZYNQ_DEVCFG_DMA_SRC_LEN_LEN( val ) BSP_FLD32( val, 0, 26 )
>         +  uint32_t dma_dest_len; /* (sic) */
>         +#define ZYNQ_DEVCFG_DMA_DEST_LEN_LEN( val ) BSP_FLD32( val, 0, 26 )
>         +  uint32_t reserved0;
>         +  uint32_t multiboot_addr;
>         +  uint32_t reserved1;
>         +  uint32_t unlock;
>         +  uint32_t reserved2[18];
>         +  uint32_t mctrl;
>         +#define ZYNQ_DEVCFG_MCTRL_PS_VERSION_GET( reg ) BSP_FLD32GET(
>         reg, 28, 31 )
>         +#define ZYNQ_DEVCFG_MCTRL_PS_VERSION_1_0 0x0
>         +#define ZYNQ_DEVCFG_MCTRL_PS_VERSION_2_0 0x1
>         +#define ZYNQ_DEVCFG_MCTRL_PS_VERSION_3_0 0x2
>         +#define ZYNQ_DEVCFG_MCTRL_PS_VERSION_3_1 0x3
>         +#define ZYNQ_DEVCFG_MCTRL_PCFG_POR_B_GET( reg ) BSP_FLD32GET(
>         reg, 8, 8 )
>         +#define ZYNQ_DEVCFG_MCTRL_INT_PCAP_LPBK_GET( reg )
>         BSP_FLD32GET( reg, 4, 4 )
>         +#define ZYNQ_DEVCFG_MCTRL_INT_PCAP_LPBK( val ) BSP_FLD32( val,
>         4, 4 )
>         +#define ZYNQ_DEVCFG_MCTRL_RESERVED_SET_BITS ( 0x800000 )
>         +#define ZYNQ_DEVCFG_MCTRL_RESERVED_UNSET_BITS ( 0x3 )
>         +#define ZYNQ_DEVCFG_MCTRL_SET( reg, val ) ( ( ( reg ) & \
>         +  ~ZYNQ_DEVCFG_MCTRL_RESERVED_UNSET_BITS ) | \
>         +  ZYNQ_DEVCFG_MCTRL_RESERVED_SET_BITS | ( val ) )
>         +  uint32_t reserved3[32];
>         +  uint32_t xadcif_cfg;
>         +  uint32_t xadcif_int_sts;
>         +  uint32_t xadcif_int_mask;
>         +  uint32_t xadcif_msts;
>         +  uint32_t xadcif_cmdfifo;
>         +  uint32_t xadcif_rdfifo;
>         +  uint32_t xadcif_mctrl;
>         +} zynq_devcfg_regs;
>
>
>     <burb> that is a lot of work for accessing a subset of the avaliable
>     registers (I do not like struct mapping to hardware, too many
>     scars). I can only hope these values are correct given the TRM is
>     defined in absolute addresses and offsets.
>
>
> I'll consider moving to the get/set method you proposed for the SLCR
> patch.

Thanks

> The struct mapping bit me a couple times on this project alone.

I have been brought in late to a few projects over the years where this 
was part of the problems. A mix of reasons, hard to audit, inconsistent 
use of volatile, compiler issues, bit fields and the list goes on.

>
>
>         +/** @brief Bitstream lengths in bytes */
>         +enum zynq_devcfg_bitstream_length {
>         +  ZYNQ_DEVCFG_BITSTREAM_LEN_7z007s = 16669920UL / 8,
>         +  ZYNQ_DEVCFG_BITSTREAM_LEN_7z012s = 28085344UL / 8,
>         +  ZYNQ_DEVCFG_BITSTREAM_LEN_7z014s = 32364512UL / 8,
>         +  ZYNQ_DEVCFG_BITSTREAM_LEN_7z010 = 16669920UL / 8,
>         +  ZYNQ_DEVCFG_BITSTREAM_LEN_7z015 = 28085344UL / 8,
>         +  ZYNQ_DEVCFG_BITSTREAM_LEN_7z020 = 32364512UL / 8,
>         +  ZYNQ_DEVCFG_BITSTREAM_LEN_7z030 = 47839328UL / 8,
>         +  ZYNQ_DEVCFG_BITSTREAM_LEN_7z035 = 106571232UL / 8,
>         +  ZYNQ_DEVCFG_BITSTREAM_LEN_7z045 = 106571232UL / 8,
>         +  ZYNQ_DEVCFG_BITSTREAM_LEN_7z100 = 139330784UL / 8
>
>
>     Where did these figures come from?
>
>     I would be careful here. A bitfile is a sequence of words that are
>     interpreted. The hardware has registers and op codes and so the
>     length of data can vary including the number of dummy values, or
>     sync bytes that appear in the bit sequence. For example a secure
>     bitfile has a bit set in one of the internal register that selects
>     the secure mode.
>
>     I will send you some code offline I have that decodes the format.
>     There is a Xilinx UG on the topic, I forget the number.
>
>
> I pulled those numbers from the TRM. They end up being equivalent to the
> length of the BIN-format bitstream that Vivado outputs. I'll be removing
> this as I mentioned above since it no longer has a use.

Thanks.

>
> I'll plan on sending out a v2 patch early next week.
>

Awesome, and thanks for submitting this driver.

> Thanks again for all the feedback,

No problem.

Chris



More information about the devel mailing list