{"id":16,"date":"2019-08-29T21:06:47","date_gmt":"2019-08-29T21:06:47","guid":{"rendered":"http:\/\/engprojects.tcnj.edu\/neuralnetaccelerator\/?page_id=16"},"modified":"2020-01-27T16:34:16","modified_gmt":"2020-01-27T16:34:16","slug":"progression-and-team-notes","status":"publish","type":"page","link":"https:\/\/engprojects.tcnj.edu\/neuralnetaccelerator\/progression-and-team-notes\/","title":{"rendered":"Notes and Updates"},"content":{"rendered":"\n<h2 class=\"wp-block-heading\"><strong>June 28th, 2019<\/strong><\/h2>\n\n\n\n<p><strong>Goals of the Project:<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>Have an FPGA do an on-demand image convolution when the PC tells it to<\/li><li>Build a convolution hardware framework in Verilog, then use low-level C to have the PC send commands and image data to the FPGA<\/li><li>Explore the implementation of neural networks in FPGAs<ul><li>Map out resource usage and latency for deep neural network architectures and hyperparameters<ul><li>Latency refers to the total time (typically expressed in units of \u201cclocks\u201d) required for a single iteration of the algorithm to complete<\/li><\/ul><\/li><li>Demonstrate deep learning techniques in FPGA applications<\/li><\/ul><\/li><\/ul>\n\n\n\n<p><strong>Initial Planning:<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\"><li><strong>Hardware diagram:<\/strong><ul><li>System, top-level blocks are:<ul><li>PC &amp; FPGA, connected via USB, \u201cserial over USB\u201d<\/li><\/ul><\/li><li>FPGA, top-level blocks might 
be:<ul><li>Usb_interface<\/li><li>Stream_parser<\/li><li>Line_buffer<\/li><li>Image_data_fetcher<\/li><li>Convo_unit<\/li><li>Output_formatter<\/li><\/ul><\/li><\/ul><\/li><li><strong>Software diagrams:<\/strong><ul><li>Windows application<ul><li>Shows input data, processing (decompression of images, reading convolutional weights from a file, etc), communication via serial read\/write<\/li><li>File formats (e.g. how weights are accessed)<\/li><li>Key data structures<\/li><li>Communications protocol for serial<\/li><\/ul><\/li><\/ul><\/li><li><strong>Verification:<\/strong><ul><li>Bit-accurate emulation of hardware processor in C\/C++, Java or Python, MATLAB, Simulink, etc.<\/li><li>Verilog testbenches for various hardware modules and subsystems (groups of modules) and end-to-end FPGA design verification<\/li><li>System verification plan \u2013 what data will you use, how will you know if you are successful?<\/li><\/ul><\/li><\/ul>\n\n\n\n<p>Who will do what?&nbsp;<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>Who will own PC\/Windows software?<ul><li>Paul<\/li><\/ul><\/li><li>Who will own software emulator of hardware?<ul><li>Paul with help from Hussain<\/li><\/ul><\/li><li>Who is in charge of the data transfer protocol (PC to FPGA and vice versa)<ul><li>Ryan<\/li><\/ul><\/li><li>Who will own Verilog architecture \u2013 specification of all top-level blocks (name, pinouts, behavior), and protocol-accurate specification of all interfaces between all top-level blocks<ul><li>Hussain<\/li><\/ul><\/li><li>Who will own design of each top-level Verilog block?<ul><li>Hussain and Ryan<\/li><\/ul><\/li><li>Who will own project management (create and track schedule of tasks and milestones)?<ul><li>Kiera<\/li><\/ul><\/li><li>Who will own the website creation and maintenance?<ul><li>Kiera<\/li><\/ul><\/li><li>Who will take notes from each team meeting (including weekly with me), and publish notes on website?<ul><li>Kiera<\/li><\/ul><\/li><li>Who will manage Git repo, 
configurations, releases?<ul><li>Hussain<\/li><\/ul><\/li><\/ul>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>July 7th, 2019<\/strong><\/h2>\n\n\n\n<ul class=\"wp-block-list\"><li>USB\/Ethernet<ul><li>Should we stick to USB or would it be worth researching Ethernet as a potential transmission medium?<\/li><\/ul><\/li><li>What is our project going to be?<ul><li>Send convolution operations through c level program, send image data, communicate through USB&nbsp;<ul><li>How much memory do we have on an FPGA?<\/li><\/ul><\/li><li>PC keeps track of data being transmitted, FPGA just runs until it runs out of data<\/li><li>What modules do we need?<ul><li>PC Side &#8212; Host program, written in C, in a linux environment<\/li><li>Data transfer out of and onto the FPGA, PC image parts&nbsp;<\/li><li>Convolution reconstructor?<\/li><\/ul><\/li><\/ul><\/li><\/ul>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/lh5.googleusercontent.com\/JDi_BPPcjk6zP8cDw38evo9JW6FuhuvSB7utoHD6XrS4dbuHwPgJlZU-U-qOcQ06Yd2-WnvilNuuigw3kaXaOp4pSKn4MN7nle3VP5AFudrHh1KnmLGJ64fH99dMsHXhTSyY-wGR\" alt=\"\"\/><\/figure>\n\n\n\n<p>Github created: Contains the pseudocode describing the functionality for a Host-FPGA Convolutional Accelerator&nbsp;<br><\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>July 28th, 2019<\/strong><\/h2>\n\n\n\n<p>Transmission medium should be USB\/UART<\/p>\n\n\n\n<p>Total RAM: 500 kilobytes<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/lh6.googleusercontent.com\/e-GhW0oIymikCMSF6Ebr1dLFkYhKM1QNPK5ltSwNOPtl194njYZ1P9M3vWHpBvyjlzoCMpRc1TBm6wOYnCAF6RwXb89qLhpX_E2XmsTHtcq5A0wLTkPDfGDvGHv68dSB-QBsUWS5\" alt=\"\"\/><\/figure>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/lh6.googleusercontent.com\/0txc9HOIig_tXUp78CvTmyF90QjnH73s0q9wtrRDzgLtlhDfNnswG7_1w12yxXprd5kYcifzFq1Pl97ePqi1Dgr58Qf7bH2oY05cp4guhm0RvdgO-cGVN2EvBH4MfYoeCBXaESp7\" alt=\"\"\/><\/figure>\n\n\n\n<figure 
class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/lh5.googleusercontent.com\/O1F66DJ4P7fgWt3bkUNhJczNPdf68wNaKjwlDXla0onAicCCDzLLjS_WAVs74_eejtfiYlRYWSaY5u35McHNYGjTuuiZXoazoBF93aCd5ATs34hWzzCcdCwyoxvlf42KAdqpfa1Z\" alt=\"\"\/><\/figure>\n\n\n\n<p>Requirements (for your customer)<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>Image<ul><li>512x512x3 input initial<\/li><li>\u201cSame\u201d setting initial, \u201cvalid\u201d setting stretch<\/li><\/ul><\/li><li>Kernel<ul><li>7x7x3 size initial<\/li><li>Coefficients are Q0.15 fixed point<\/li><\/ul><\/li><li>Output<ul><li>1 feature map initial, n feature maps stretch<\/li><\/ul><\/li><li>UI<ul><li>fileIO user program initial, API stretch<\/li><\/ul><\/li><\/ul>\n\n\n\n<p>Specifications:<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>Specify hardware, clock rate, BAUD rate ?<\/li><li>USB and serial<\/li><li>R0B0G0 (ex. to send pixels) byte by byte<\/li><\/ul>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>August 29th, 2019<\/strong><\/h2>\n\n\n\n<ul class=\"wp-block-list\"><li>Started working on C-based image parser, breaking the image down, streaming image, pseudocode for that<ul><li>POST PSEUDOCODE<\/li><\/ul><\/li><li>How do you drive a UART controller from a c program?<\/li><li>Linux based; tty logical device<\/li><li>Break up image, get raw bytes, distribute those<\/li><li>Using GCC<\/li><\/ul>\n\n\n\n<p>*Hardware complete by the end of the semester<\/p>\n\n\n\n<p>Suggestions from Dr. Pearlstein:<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>Use 16 bit number for quantization and noise<\/li><li>Forces you to have 4 leading 0\u2019s which you won\u2019t represent\u2026 people use numbers greater than 1<\/li><li>What is the dynamic range? \u2026. 
you can always scale it finding min and max, preprocess and normalize<ul><li>Won\u2019t scale on FPGA<\/li><li>Avoid saturation to the best of our abilities<\/li><li>Dynamic allocation<\/li><\/ul><\/li><li>Chunky vs Planar?<ul><li>Chunky<\/li><\/ul><\/li><\/ul>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>September 2nd, 2019<\/strong><\/h2>\n\n\n\n<ul class=\"wp-block-list\"><li>Changing image size to 512x512x3&nbsp;<ul><li>This is what most conv nets use, not bottlenecked by memory<\/li><\/ul><\/li><li>Maintain Kernel size of 7&#215;7<\/li><\/ul>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>September 4th, 2019<\/strong><\/h2>\n\n\n\n<ul class=\"wp-block-list\"><li>Uart block gets us data, with a FIFO<\/li><\/ul>\n\n\n\n<p>High Level Requirements:<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>What does the FPGA do?<ul><li>Reads data from the PC<\/li><li>Stores kernel data<\/li><li>Stores part of the image<\/li><li>Compute sum of products (SOP) for image parts kernel<\/li><li>Send result to PC<\/li><\/ul><\/li><li>What does the PC do?<ul><li>Load user image<\/li><li>Send image to FPGA<\/li><li>Receive enddata from FPGA&nbsp;<\/li><li>Present final output to user<\/li><\/ul><\/li><li>Test Plan<ul><li>Send in a 512x512x3 image (the data), and validate the output&nbsp;<\/li><\/ul><\/li><\/ul>\n\n\n\n<p>Bottleneck&#8211;UART<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>September 5th, 2019<\/strong><\/h2>\n\n\n\n<p><strong>Block Diagrams\/Architecture<\/strong><\/p>\n\n\n\n<p>Level 0: High-level<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/lh6.googleusercontent.com\/ZtMmadDBghJ1UxhapXPvB4OHk0gDY1VsikUeZ-2h6N5pgFIoLEe3koDM0YvhGqZ_gosWDVJmESWZIjXDY3SNmtBdFbRjuHCuYh-KYx5BEbwME9-Jdy-kqqxcwyjMBJJyNSS35j1C\" alt=\"\"\/><\/figure>\n\n\n\n<p>Level 1: Mid-level<\/p>\n\n\n\n<p>PC:<\/p>\n\n\n\n<p>N memories, n-1 memories (n-1 lines)<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" 
src=\"https:\/\/lh5.googleusercontent.com\/uu51D9sGKv5ZtzfEBiQdtZXkmOBuUxJFLgcAm8tbf7qQOVTUUwMsLwHHF6XnKQGhMifATx2BibO7V6S2kGr8-Wg7Uf6SJrWVykHtejx2gnyBDUrWqeL0eON9--HT5F3U8qG2Odjj\" alt=\"\"\/><\/figure>\n\n\n\n<p>FPGA:<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/lh3.googleusercontent.com\/lf2oVa2T3xIAh8cOWCisLV1_aVv8IHAKyuSAO7uQMEPYxow470z-HQXlv89jmfdGS-0SqJnmjt94Iz4CRhP_gq2gBMJ7iMkBcD1z48kydTdEzlmLApXRV-tAFPNORI4LSDZZlY_u\" alt=\"\"\/><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>September 12th, 2019<\/strong><\/h2>\n\n\n\n<ul class=\"wp-block-list\"><li>Top-level diagram<\/li><li>Drawing that shows connectivity<\/li><li>Create standard naming convention<\/li><li>Prototypes for blocks by next week<\/li><\/ul>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>September 19th, 2019<\/strong><\/h2>\n\n\n\n<ul class=\"wp-block-list\"><li>Successful implementation of a FIFO for both the sending and receiving sides of the UART module<\/li><li>Simulations looked promising; a bitstream to test the design in hardware was created<\/li><li>Implemented Verilog functionality for the image window and sum-of-products multipliers<\/li><li>Working log in Google Sheets was created and updated<br><br>To be completed for next week:&nbsp;<\/li><li>Module fully tested in hardware to determine the maximum baud rate for the project<\/li><li>Begin working on the RTL for the line buffer using BRAM blocks<\/li><li>Implement the controller FSM and start testing each module<\/li><li>Figure out the fixed-point scaling during the kernel\/pixel multiplication<\/li><li>IO domain<\/li><\/ul>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>September 26th, 2019<\/strong><\/h2>\n\n\n\n<ul class=\"wp-block-list\"><li>Focus on editing the WordPress website, look into add-on features the school could offer, focus less on investing in an io domain<br><strong>Image Window\/Test Benches:&nbsp;<\/strong><\/li><li>Write last two test benches, unsigned 
8-bit image, CNN values are signed (except the input, which is unsigned)<\/li><li>RTR\/RTS hardware code complete, tested shift window with high-level test bench, make sure it meets timing, shift window meets timing window<br><strong>UART:<\/strong><\/li><li>Stop bit, full bit long, check sender, framing errors, start-stop bit, produce null character<\/li><li>Figure out how to enable serial flow control, two-way flow control<\/li><li>Try to flow control the modem (FPGA)<\/li><li>Configure the chip<\/li><li>Look to parse CSV file in Python&#8230; write in C for speed purposes<\/li><li>Front load progress, aim to get OpenCV, run storage blocks<br><strong>Other:<\/strong><\/li><li>Add architecture document to the Gantt chart<\/li><li>Action items with owner and due date<\/li><li>Specify who works on the architecture document&#8211; get done by this time next week<br><\/li><li>1st draft of test plan on FPGA &#8212; Kiera and Hussain<\/li><li>1st draft of test plan for UART &#8212; Ryan and Paul<br><\/li><\/ul>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>October 3rd, 2019<\/strong><\/h2>\n\n\n\n<p>Current Status: Behind<br><\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>RTS: request to send, transmit side<\/li><li>More block diagrams and images<\/li><li>BRAM block complete, work on testing this<\/li><li>Potentially create timing diagrams, hand draw cycles<\/li><li>Not using much memory<\/li><li>Need to make more progress, particularly on the software side<\/li><li>Block memory generator<\/li><li>Working on the fixed-point multiplier for the convolution unit<\/li><\/ul>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>October 17th, 2019<\/strong><\/h2>\n\n\n\n<ul class=\"wp-block-list\"><li>Figure out the protocol together<\/li><li>Planned approach:<ul><li>BRAM is working<\/li><li>Ran one simulation, BRAM has been filled, writing and reading data at the same time, read and then write on the same cycle?<\/li><li>Dual port? 
Assume reading and writing are not occurring simultaneously<\/li><\/ul><\/li><\/ul>\n\n\n\n<p>&#8212; Is cycle rate faster than pixel rate?<\/p>\n\n\n\n<p>&#8212; Limited by the rate of multipliers<\/p>\n\n\n\n<p>&#8211;CNN&nbsp;<br><\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>Unified state flow<\/li><li>UART values and determine the protocol<\/li><li>Discussed the interfacing<ul><li>Multiple interfaces need to be specified (but might not be necessary)<\/li><li>Always ready to receive, read then set rtr low once value is written to the BRAM<\/li><\/ul><\/li><\/ul>\n\n\n\n<ul class=\"wp-block-list\"><li>State Image: Image to image start up, build up window before you can output any points<\/li><\/ul>\n\n\n\n<p>What needs to be done:<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>How the start-up works<\/li><li>Condition the start up behavior<\/li><li>Pull out garbage until you get the data you want<\/li><\/ul>\n\n\n\n<p>Start up behavior &#8212; Hussain<\/p>\n\n\n\n<p>Protocol &#8212; Paul<\/p>\n\n\n\n<p>Architecture<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>ILB, pixel rate, iron out state flow of SOPU (hussain will meet with dr. pearlstein)<\/li><li>UART, website (mr. 
lee), fit in work on preprocessors, get the protocols done<\/li><li>Hardware set up to send, it, find results live and script that in a few different&nbsp;<\/li><\/ul>\n\n\n\n<p>Serial test bench model<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>Write the Verilog code to read the image and kernel, send image and kernel data through Verilog<\/li><li>Verilog will read CSV file, kernel and image (stored as raw bytes RGB RGB); use the XnView converter to convert the image to a raw file<\/li><\/ul>\n\n\n\n<p>*Write down the test plan<br><\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>October 24th, 2019<\/strong><\/h2>\n\n\n\n<ul class=\"wp-block-list\"><li>Write program to read data from the stream<ul><li>Use fread to read the entire file into the array<\/li><li>Convert text file<\/li><li>512&#215;512<\/li><li>MATLAB to read in files (imread)<\/li><li>Take the array that you read and convert to CSV file<\/li><li>Flatten the array ; ) &#8212; converts any number&nbsp;<\/li><\/ul><\/li><li>Black and white pixels, all 0\u2019s, all 255\u2019s, simple patterns<\/li><li>Architecture Document has been updated<ul><li>Now, assume existence of sign bit and use a 9-bit system (even though it is an 8-bit system) c1.0.7<\/li><li>Add picture with blocks and describe it<\/li><\/ul><\/li><li>The SOPU will be sending data, interaction between the two images<\/li><li>Determine the max rate we can send (make sure to test this)&nbsp;<\/li><li>Autobaud can detect<\/li><li>Make a GIF of the window movement &#8211;Kiera<\/li><\/ul>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>October 31st, 2019<\/strong><\/h2>\n\n\n\n<ul class=\"wp-block-list\"><li>Update architecture document<\/li><li>Implement Ryan\u2019s new design and test with Hussain<\/li><li>Serial port is working<\/li><li>Line buffer synchronization<\/li><li>Must get the interface to work<ul><li>(Try to make enhancements)<\/li><\/ul><\/li><li>Create more efficient test benches<\/li><\/ul>\n\n\n\n<p>&#8211;Issue 
pass\/fail&nbsp;<\/p>\n\n\n\n<p>&#8211;Create random data to test<\/p>\n\n\n\n<p>&#8211;Self-checking test<\/p>\n\n\n\n<p>\u201cAutomated environment\u201d&nbsp;<\/p>\n\n\n\n<p>Communication to and from the software<br><\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>November 7th, 2019<\/strong><\/h2>\n\n\n\n<p>-Continue working on graphics for the website<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>Issues<ul><li>No physical hardware issue, as of right now<\/li><li>Not working on the terminal<\/li><\/ul><\/li><li>9600 baud<\/li><li>Windows VM<\/li><li>Ryan made a Java program for random values<ul><li>Test bench to implement<\/li><li>Read in values and compare against output values<\/li><\/ul><\/li><li>Hussain: wrote all of the RTL&nbsp;<ul><li>Interfacing between the UART and ILB (image line buffer), how data lines up, data distributed approach<ul><li>Polls the UART, ready to read, latches the byte, master reads the new byte, fills the kernel, when<\/li><li>Gets a pixel from the UART, send&nbsp;<\/li><li>Interface: ILB, immediately take in the data as it comes<\/li><\/ul><\/li><li>One to read and one to write to<\/li><li>Run test synthesis<\/li><\/ul><\/li><\/ul>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>November 14th, 2019<\/strong><\/h2>\n\n\n\n<p>Self-check is finished; a random number generator will be used to test<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>Error with ILB fixed, memory spaces were too big<\/li><li>SOPU is in the testing phase<\/li><li>Integration and unit testing<\/li><\/ul>\n\n\n\n<p>Work on the documentation<\/p>\n\n\n\n<p>What speed to use for UART?<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>9600, slow, stalls for new data but at least it works<\/li><\/ul>\n\n\n\n<ul class=\"wp-block-list\"><li>Port to Windows, having Linux issues<\/li><li>Linux on FPGA, FTDI chip<\/li><li>Get numbers on FPGA utilization, figure out the resources required<br><\/li><\/ul>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>November 21st, 
2019<\/strong><\/h2>\n\n\n\n<ul class=\"wp-block-list\"><li>Animations agreed upon, set to finish before senior project presentations<ul><li>Paul is trying to get serial communication working, still Linux based<\/li><\/ul><\/li><li>Ran program he wrote, would return the character T, or it would time out and not return anything, seemed random<ul><li>Possible explanation: non-blocking will just come back, should not time out, can return null<\/li><\/ul><\/li><li>Need a way to block certain data<\/li><li>Check the results, do non-blocking and see how much valid data is present, reading raw bytes from a byte stream<\/li><li>Submit an abstract<br><\/li><\/ul>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>January 27th, 2020<\/strong><\/h2>\n\n\n\n<ul class=\"wp-block-list\"><li>Develop a software model for the FPGA<\/li><li>UART protocol complete, can communicate with the PC<\/li><li>C++ has better file I\/O than Verilog, so this can be exploited<ul><li>When you do arithmetic operations in Verilog the result will be limited in size<\/li><li>Many advantages to coding in C<\/li><\/ul><\/li><li>Write a Tcl script for simulations for wave debugging<\/li><li>Log all signals from the very beginning<\/li><li>Hopefully verify and test, combine ILB and SOPU together to test<\/li><li>Post synthesis, post design<\/li><li>Using different clocks can result in timing violations; if you simulate the chip well enough you can find errors<ul><li>Hassle to figure out what is wrong with the FPGA<\/li><\/ul><\/li><li><strong>Target for next week: C++ software model done, start hardware software post simulation and integrate those, hardware test the ILB, create new RTL to create a direct connection that can read and write the values at the same time, test the hardware<\/strong><\/li><li>Decoupled verification, make a plan<\/li><\/ul>\n","protected":false},"excerpt":{"rendered":"<p>June 28th, 2019 Goals of the Project: Have an FPGA do an on-demand image convolution when the PC tells it 
to&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Build a convolution hardware framework using Verilog, then using (low level) C have to PC send commands\/image data to the FPGA Explore the implementation of neural networks in FPGAs Map out resource usage, latency &hellip; <\/p>\n<p class=\"link-more\"><a href=\"https:\/\/engprojects.tcnj.edu\/neuralnetaccelerator\/progression-and-team-notes\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;Notes and Updates&#8221;<\/span><\/a><\/p>\n","protected":false},"author":197,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"ngg_post_thumbnail":0,"footnotes":""},"class_list":["post-16","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/engprojects.tcnj.edu\/neuralnetaccelerator\/wp-json\/wp\/v2\/pages\/16","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/engprojects.tcnj.edu\/neuralnetaccelerator\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/engprojects.tcnj.edu\/neuralnetaccelerator\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/engprojects.tcnj.edu\/neuralnetaccelerator\/wp-json\/wp\/v2\/users\/197"}],"replies":[{"embeddable":true,"href":"https:\/\/engprojects.tcnj.edu\/neuralnetaccelerator\/wp-json\/wp\/v2\/comments?post=16"}],"version-history":[{"count":0,"href":"https:\/\/engprojects.tcnj.edu\/neuralnetaccelerator\/wp-json\/wp\/v2\/pages\/16\/revisions"}],"wp:attachment":[{"href":"https:\/\/engprojects.tcnj.edu\/neuralnetaccelerator\/wp-json\/wp\/v2\/media?parent=16"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}