
    let paddedK: [Float] = pad(sequence: kernel, other: x)

Now we can compute the convolution of paddedX and paddedK:

Finally, the result of the convolution is:

    // y = [1, 4, 10, 16, 22]
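The code for this step is not shown here, so the following is only a minimal sketch of how a direct convolution of two zero-padded sequences might look. The names `zeroPadded` and `convolve` are hypothetical stand-ins (the article's own helper is `pad(sequence:other:)`), and the inputs are assumed to be the `x` and `kernel` arrays used in the Accelerate example later in this article. A full linear convolution of M and N samples has M+N-1 values, so this sketch prints seven samples, the first five of which match the values quoted above.

```swift
// Hypothetical helper: appends zeros so that `sequence` becomes as long as
// the full linear convolution with `other` (count + other.count - 1 samples).
func zeroPadded(_ sequence: [Float], against other: [Float]) -> [Float] {
    precondition(!other.isEmpty, "the other sequence must not be empty")
    return sequence + [Float](repeating: 0, count: other.count - 1)
}

// Direct-form convolution of two equally long, zero-padded sequences:
// y[n] = sum over k in 0...n of x[k] * h[n - k].
func convolve(_ paddedX: [Float], _ paddedK: [Float]) -> [Float] {
    precondition(paddedX.count == paddedK.count, "inputs must be padded to the same length")
    var y = [Float](repeating: 0, count: paddedX.count)
    for n in 0..<y.count {
        for k in 0...n {
            y[n] += paddedX[k] * paddedK[n - k]
        }
    }
    return y
}

let x: [Float] = [1, 2, 3, 4, 5]
let kernel: [Float] = [1, 2, 3]
let y = convolve(zeroPadded(x, against: kernel), zeroPadded(kernel, against: x))
print(y) // [1.0, 4.0, 10.0, 16.0, 22.0, 22.0, 15.0]
```

Padding both sequences to the same length keeps every index inside the loops in bounds, at the cost of multiplying by some zeros.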

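Convolution with Accelerate

If you want to speed up the convolution, you can use the vDSP_conv function provided by the Accelerate framework. Again, I need to take care of the boundary conditions and of the kernel reversal. This time, I zero-pad the input array and the kernel in a different way. In addition, I need to reverse the kernel (as explained in the documentation); otherwise I obtain the correlation of the two sequences rather than their convolution.

Here is the implementation using Accelerate: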

    import Accelerate

    let x: [Float] = [1, 2, 3, 4, 5], M = x.count
    let kernel: [Float] = [1, 2, 3], N = kernel.count
    let T = N + M - 1

    var res = [Float](repeatElement(0, count: T))

    // Pad the input with N-1 zeros on both sides
    let zeros = [Float](repeatElement(0, count: N - 1))
    let newXin = zeros + x + zeros

    // Pass the reversed kernel so that the result is a convolution, not a correlation
    vDSP_conv(newXin, 1, Array(kernel.reversed()), 1, &res, 1, vDSP_Length(T), vDSP_Length(N))
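To make the need for the kernel reversal concrete, here is a plain-Swift model of the sum that vDSP_conv evaluates when both strides are 1. The function `correlate` is a hypothetical reference model written only for illustration; it is not part of Accelerate and makes no attempt to be fast.

```swift
// Reference model (an assumption for illustration, not the Accelerate code):
// c[n] = sum over p of a[n + p] * f[p], which is a correlation.
func correlate(_ a: [Float], _ f: [Float], outputCount: Int) -> [Float] {
    precondition(a.count >= outputCount + f.count - 1, "input is too short")
    return (0..<outputCount).map { n -> Float in
        var sum: Float = 0
        for p in 0..<f.count {
            sum += a[n + p] * f[p]
        }
        return sum
    }
}

let x: [Float] = [1, 2, 3, 4, 5]
let kernel: [Float] = [1, 2, 3]
let T = x.count + kernel.count - 1

// Same padding as in the Accelerate example: N-1 zeros on both sides.
let zeros = [Float](repeating: 0, count: kernel.count - 1)
let newXin = zeros + x + zeros

let correlation = correlate(newXin, kernel, outputCount: T)
let convolution = correlate(newXin, Array(kernel.reversed()), outputCount: T)
print(correlation) // [3.0, 8.0, 14.0, 20.0, 26.0, 14.0, 5.0]
print(convolution) // [1.0, 4.0, 10.0, 16.0, 22.0, 22.0, 15.0]
```

With the kernel left as is, the sum slides the kernel over the input without flipping it; only the reversed kernel gives the convolution.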

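With such a short input sequence, you will not notice the speed-up provided by the Accelerate framework. However, when I create an input array of 100,000 elements and convolve it with the same kernel as in the previous example, the Swift implementation takes 318 ms on my MacBook Pro, while the Accelerate vDSP_conv function takes only 159 ns.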
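A rough sketch of how such a timing could be taken for the plain-Swift version is shown below. The `measure` helper and this `convolve` function are hypothetical and written only for illustration; the all-ones input is an assumption, and the measured time depends heavily on the machine, the optimization level, and whether the code runs in a playground.

```swift
import Foundation

// Plain-Swift direct convolution, used here only as the code being timed.
func convolve(_ x: [Float], _ kernel: [Float]) -> [Float] {
    var y = [Float](repeating: 0, count: x.count + kernel.count - 1)
    for i in 0..<x.count {
        for j in 0..<kernel.count {
            y[i + j] += x[i] * kernel[j]
        }
    }
    return y
}

// Hypothetical helper: runs `work` once and returns its result together
// with the elapsed wall-clock time in milliseconds.
func measure<Result>(_ work: () -> Result) -> (result: Result, milliseconds: Double) {
    let start = Date()
    let result = work()
    let elapsed = Date().timeIntervalSince(start) * 1000
    return (result, elapsed)
}

let x = [Float](repeating: 1, count: 100_000)
let kernel: [Float] = [1, 2, 3]
let timing = measure { convolve(x, kernel) }
print("Swift convolution of \(x.count) samples took \(timing.milliseconds) ms")
```

A single run like this is noisy; averaging several runs in a release build would give a more trustworthy comparison.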

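Convolution with Metal

Let's see how to implement the same example with Metal. See this post to learn how to set up a Metal project for GPU compute.

In this particular example, we need to create three Metal textures (objects conforming to the MTLTexture protocol): the first texture stores the input sequence, the second stores the kernel, and the third stores the final result.

Here is the source code that creates these textures: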

    import Metal

    // Zero-pad the input and the kernel to the length of the full convolution
    let paddedX: [Float] = input + [Float](repeatElement(0, count: N-1))
    let paddedK: [Float] = kernel + [Float](repeatElement(0, count: M-1))

    // Texture that stores the input sequence
    let inputTextureDescriptor = MTLTextureDescriptor.texture2DDescriptor(with: .r32Float, width: paddedX.count, height: 1, mipmapped: false)
    inputTextureDescriptor.usage = .shaderRead
    inTexture = metalContext.device.newTexture(with: inputTextureDescriptor)
    let region = MTLRegionMake2D(0, 0, paddedX.count, 1)
    inTexture?.replace(region, mipmapLevel: 0, withBytes: paddedX, bytesPerRow: paddedX.count * sizeof(Float32.self))

    // Texture that stores the kernel
    let kernelTextureDescriptor = MTLTextureDescriptor.texture2DDescriptor(with: .r32Float, width: paddedK.count, height: 1, mipmapped: false)
    kernelTexture = metalContext.device.newTexture(with: kernelTextureDescriptor)
    let kernelRegion = MTLRegionMake2D(0, 0, paddedK.count, 1)
    kernelTexture?.replace(kernelRegion, mipmapLevel: 0, withBytes: paddedK, bytesPerRow: paddedK.count * sizeof(Float32.self))

    // Texture that receives the result
    let outputTextureDescriptor = MTLTextureDescriptor.texture2DDescriptor(with: .r32Float, width: paddedX.count, height: 1, mipmapped: false)
    outputTextureDescriptor.usage = .shaderWrite
    outTexture = metalContext.device.newTexture(with: outputTextureDescriptor)

    executeConvolution()
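The `bytesPerRow` argument above is written with `sizeof`, which was removed from the language in Swift 3 in favor of `MemoryLayout`. As a small sketch, the same value could be computed as follows; the sample `paddedX` contents are an assumption that mirrors the five-sample input padded for a three-tap kernel.

```swift
// One row of an .r32Float texture holds one 32-bit float per texel.
let paddedX: [Float] = [1, 2, 3, 4, 5, 0, 0]

// MemoryLayout<Float32>.stride is the number of bytes between consecutive
// Float32 values in an array (4 bytes).
let bytesPerRow = paddedX.count * MemoryLayout<Float32>.stride
print(bytesPerRow) // 28
```

Using `stride` rather than `size` is the usual choice when computing offsets between array elements, although both are 4 for `Float32`.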

In the source code above, metalContext is an instance of the following class:

    final class MetalContext: NSObject {
        let device: MTLDevice
        let commandQueue: MTLCommandQueue
        let library: MTLLibrary

        override init() {
            // Get the device
            self.device = MTLCreateSystemDefaultDevice()!
            // Create a command queue
            self.commandQueue = device.newCommandQueue()
            // Get the default library
            self.library = device.newDefaultLibrary()!
            super.init()
        }
    }

This is just a helper class that I usually use to set up the main objects of a Metal stack.

Finally, the executeConvolution() method encodes the GPU commands:

    func executeConvolution() {
        guard let outTexture = self.outTexture else { return }

        let commandBuffer = metalContext.commandQueue.commandBuffer()
        let computeCommandEncoder = commandBuffer.computeCommandEncoder()
        computeCommandEncoder.setComputePipelineState(computePipelineState!)
        computeCommandEncoder.setTexture(inTexture, at: 0)
        computeCommandEncoder.setTexture(kernelTexture, at: 1)
        computeCommandEncoder.setTexture(outTexture, at: 2)
        computeCommandEncoder.dispatchThreadgroups(MTLSizeMake(T, 1, 1), threadsPerThreadgroup: MTLSizeMake(1, 1, 1))
        computeCommandEncoder.endEncoding()
        commandBuffer.commit()

        let region = MTLRegionMake1D(0, T)
        var buffer = [Float32](repeatElement(0, count: T))

