Upload large files to web api using resumable.js


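At work I recently needed to support uploading large files (> 2 GB). Uploading large files differs somewhat from the usual handling of small files, and it raises three additional problems:
1. Upload timeouts.
2. IIS does not support uploading files larger than 2 GB.
3. Interruptions while the upload is in progress.
These three problems can be addressed by combining multipart/form-data with chunking: the file is split into multiple numbered chunks, and any chunk that fails to transfer is simply sent again, which gives us resumable uploads.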
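To make the chunking idea concrete, here is a minimal sketch of how a file's bytes can be divided into numbered ranges. The names `ChunkRange` and `planChunks` are made up for illustration, and this is a generic splitting scheme rather than Resumable.js's own algorithm, which may size the final chunk differently.

```typescript
// Hypothetical helper type: one chunk described as a byte range of the original file.
interface ChunkRange {
  number: number; // 1-based chunk number
  start: number;  // inclusive byte offset
  end: number;    // exclusive byte offset
}

// Split a file of `fileSize` bytes into consecutive ranges of at most `chunkSize` bytes.
function planChunks(fileSize: number, chunkSize: number): ChunkRange[] {
  if (fileSize < 0 || chunkSize <= 0) {
    throw new Error('fileSize must be >= 0 and chunkSize must be > 0');
  }
  const ranges: ChunkRange[] = [];
  for (let start = 0, n = 1; start < fileSize; start += chunkSize, n++) {
    ranges.push({ number: n, start: start, end: Math.min(start + chunkSize, fileSize) });
  }
  return ranges;
}

// A 10-byte "file" with 3-byte chunks yields four chunks; the last one holds the remainder.
const plan = planChunks(10, 3);
console.log(plan.length, plan[plan.length - 1]);
```

In a browser, each range could then be cut out of the selected file with `file.slice(start, end)` and sent as its own request, so a failure only costs one chunk rather than the whole upload.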

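Three JavaScript libraries currently available that support large files and resumable uploads are Resumable.js, Dropzone.js, and jQuery Ajax File Upload. All of them are built on the HTML5 File API, their authors provide sample server-side implementations, and TypeScript type definitions are available as well, which is very generous of them.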

Next, let's look at how to use Resumable.js to upload files to a Web API (.NET):
Main flow on the server side (Web API):
1. First, the Web API uses MultipartFormDataStreamProvider to read each chunk that Resumable.js sends. A chunk consists of FormData and FileData: FormData carries the chunk fields defined by Resumable.js, and FileData holds the chunk's file content.
2. Next, read the Resumable.js fields resumableFilename, resumableIdentifier, resumableTotalChunks, and resumableChunkNumber from FormData, and use them to store the chunk in the Upload directory.
3. Once all chunks have arrived, merge them into the complete file.
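Before the C# code, the store-then-merge logic of steps 2 and 3 can be sketched in a few lines. This is only an illustration under stated assumptions: `ChunkStore` and `addChunk` are hypothetical names, and an in-memory object stands in for the Upload directory on disk.

```typescript
// Hypothetical in-memory stand-in for the Upload directory.
class ChunkStore {
  // identifier -> (chunk number -> chunk bytes)
  private uploads: { [identifier: string]: { [chunkNumber: number]: Uint8Array } } = {};

  // Store one chunk; return the assembled file once chunks 1..totalChunks are all present.
  addChunk(identifier: string, chunkNumber: number, totalChunks: number, data: Uint8Array): Uint8Array | null {
    const chunks = this.uploads[identifier] || (this.uploads[identifier] = {});
    chunks[chunkNumber] = data; // a re-sent chunk simply overwrites the earlier copy

    const ordered: Uint8Array[] = [];
    let size = 0;
    for (let n = 1; n <= totalChunks; n++) {
      const chunk = chunks[n];
      if (!chunk) return null; // still waiting for chunk n
      ordered.push(chunk);
      size += chunk.length;
    }

    // Every chunk has arrived: concatenate them in chunk-number order.
    const file = new Uint8Array(size);
    let offset = 0;
    for (const chunk of ordered) {
      file.set(chunk, offset);
      offset += chunk.length;
    }
    delete this.uploads[identifier]; // drop the pieces once the file is assembled
    return file;
  }
}

// Chunks may arrive out of order; the file is only assembled when none is missing.
const store = new ChunkStore();
const first = store.addChunk('demo-id', 2, 2, new Uint8Array([3, 4]));  // null: chunk 1 is missing
const second = store.addChunk('demo-id', 1, 2, new Uint8Array([1, 2])); // bytes 1, 2, 3, 4
console.log(first, second);
```

Keying everything by the identifier and chunk number is what lets chunks arrive in any order, or more than once, without corrupting the result.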

FileStorageController Class

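    // Namespaces used by the controller and the Resumable helper below
    using System;
    using System.IO;
    using System.Net;
    using System.Net.Http;
    using System.Threading.Tasks;
    using System.Web;
    using System.Web.Http;

    [RoutePrefix("api/FileStorage")]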
    public class FileStorageController : ApiController
    {
        private Resumable _Resumable = new Resumable(HttpContext.Current.Server.MapPath("~/Upload"));

        [Route("Upload"), HttpOptions]
        public object UploadFileOptions()
        {
            return Request.CreateResponse(HttpStatusCode.OK);
        }

        [Route("Upload"), HttpGet]
        public object Upload(int resumableChunkNumber, string resumableIdentifier)
        {
            return _Resumable.ChunkIsHere(resumableChunkNumber, resumableIdentifier) ? Request.CreateResponse(HttpStatusCode.OK) : Request.CreateResponse(HttpStatusCode.NoContent);
        }

        [Route("Upload"), HttpPost]
        public async Task<object> Upload()
        {

            try
            {
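                // Return 415 directly; throwing here would be caught below and turned into a 500
                if (!Request.Content.IsMimeMultipartContent())
                    return Request.CreateErrorResponse(HttpStatusCode.UnsupportedMediaType, "Expected multipart/form-data content.");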
                string root = _Resumable.GetRoot();
                if (!Directory.Exists(root)) Directory.CreateDirectory(root);
                var provider = new MultipartFormDataStreamProvider(root);
                if (await ReadPart(provider))
                {
                    // The chunk was stored (and the file assembled if this was the last chunk)
                    return Request.CreateResponse(HttpStatusCode.OK);
                }
                else
                {
                    var message = _Resumable.DeleteInvalidChunkData(provider) ? "Cannot read multi part file data." : "Cannot delete temporary file chunk data.";
                    return Request.CreateErrorResponse(HttpStatusCode.InternalServerError, message);
                }
            }
            catch (Exception ex)
            {
                return Request.CreateErrorResponse(HttpStatusCode.InternalServerError, ex.ToString());
            }
        }

        public async Task<bool> ReadPart(MultipartFormDataStreamProvider provider)
        {
            try
            {
                await Request.Content.ReadAsMultipartAsync(provider);
                ResumableConfig configuration = _Resumable.GetUploadConfiguration(provider);
                int chunkNumber = _Resumable.GetChunkNumber(provider);
                MultipartFileData chunk = provider.FileData[0]; // Only one file in multipart message
                _Resumable.RenameChunk(chunk, chunkNumber, configuration.Identifier);
                _Resumable.TryAssembleFile(configuration);
                return true;
            }
            catch
            {
                // Report failure so the caller can delete the temporary chunk file
                return false;
            }
        }
    }
Resumable Class

    public class ResumableConfig
    {
        public int Chunks { get; set; }
        public string Identifier { get; set; }
        public string FileName { get; set; }
        public static ResumableConfig Create(string identifier, string filename, int chunks)
        {
            return new ResumableConfig { Identifier = identifier, FileName = filename, Chunks = chunks };
        }
    }
    public class Resumable
    {
        private string Root { get; set; }
        public Resumable(string root) {
            this.Root = root;
        }

        public string GetRoot() {
            return this.Root;
        }

        public bool DeleteInvalidChunkData(MultipartFormDataStreamProvider provider)
        {
            try
            {
                var localFileName = provider.FileData[0].LocalFileName;
                if (File.Exists(localFileName)) File.Delete(localFileName);               
                return true;
            }
            catch
            {
                return false;
            }
        }

        #region Get configuration
        public ResumableConfig GetUploadConfiguration(MultipartFormDataStreamProvider provider)
        {
            return ResumableConfig.Create(identifier: GetId(provider), filename: GetFileName(provider), chunks: GetTotalChunks(provider));
        }

        public string GetFileName(MultipartFormDataStreamProvider provider)
        {
            var filename = provider.FormData["resumableFilename"];
            return !String.IsNullOrEmpty(filename) ? filename : provider.FileData[0].Headers.ContentDisposition.FileName.Trim('\"');
        }

        public string GetId(MultipartFormDataStreamProvider provider)
        {
            var id = provider.FormData["resumableIdentifier"];
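            // Strip any directory part so a crafted identifier cannot escape the upload root
            return !String.IsNullOrEmpty(id) ? Path.GetFileName(id) : Guid.NewGuid().ToString();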
        }

        public int GetTotalChunks(MultipartFormDataStreamProvider provider)
        {
            var total = provider.FormData["resumableTotalChunks"];
            return !String.IsNullOrEmpty(total) ? Convert.ToInt32(total) : 1;
        }

        public int GetChunkNumber(MultipartFormDataStreamProvider provider)
        {
            var chunk = provider.FormData["resumableChunkNumber"];
            return !String.IsNullOrEmpty(chunk) ? Convert.ToInt32(chunk) : 1;
        }
        #endregion

        #region Chunk methods
        public string GetChunkFileName(int chunkNumber, string identifier)
        {
            // Path.GetFileName drops any directory part of the client-supplied identifier
            return Path.Combine(this.Root, string.Format("{0}_{1}", Path.GetFileName(identifier), chunkNumber));
        }

        public void RenameChunk(MultipartFileData chunk, int chunkNumber, string identifier)
        {
            string generatedFileName = chunk.LocalFileName;
            string chunkFileName = GetChunkFileName(chunkNumber, identifier);
            if (File.Exists(chunkFileName)) File.Delete(chunkFileName);
            File.Move(generatedFileName, chunkFileName);

        }

        public string GetFilePath(ResumableConfig configuration)
        {
            return Path.Combine(this.Root, configuration.Identifier);
        }

        public bool ChunkIsHere(int chunkNumber, string identifier)
        {
            string fileName = GetChunkFileName(chunkNumber, identifier);
            return File.Exists(fileName);
        }

        public bool AllChunksAreHere(ResumableConfig configuration)
        {
            for (int chunkNumber = 1; chunkNumber <= configuration.Chunks; chunkNumber++)
                if (!ChunkIsHere(chunkNumber, configuration.Identifier)) return false;
            return true;
        }

        public void TryAssembleFile(ResumableConfig configuration)
        {
            if (AllChunksAreHere(configuration))
            {
                var path = ConsolidateFile(configuration);
                // Rename consolidated with original name of upload
                RenameFile(path, Path.Combine(this.Root, configuration.FileName));
                DeleteChunks(configuration);
            }
        }

        public void DeleteChunks(ResumableConfig configuration)
        {
            for (int chunkNumber = 1; chunkNumber <= configuration.Chunks; chunkNumber++)
            {
                var chunkFileName = GetChunkFileName(chunkNumber, configuration.Identifier);
                File.Delete(chunkFileName);
            }
        }

        public string ConsolidateFile(ResumableConfig configuration)
        {
            var path = GetFilePath(configuration);
            using (var destStream = File.Create(path, 15000))
            {
                for (int chunkNumber = 1; chunkNumber <= configuration.Chunks; chunkNumber++)
                {
                    var chunkFileName = GetChunkFileName(chunkNumber, configuration.Identifier);
                    using (var sourceStream = File.OpenRead(chunkFileName))
                    {
                        sourceStream.CopyTo(destStream);
                    }
                }
                destStream.Close();
            }

            return path;
        }
        #endregion

        public string RenameFile(string sourceName, string targetName)
        {
            targetName = Path.GetFileName(targetName); // Strip to filename if directory is specified (avoid cross-directory attack)
            string realFileName = Path.Combine(this.Root, targetName);
            if (File.Exists(realFileName)) File.Delete(realFileName);
            File.Move(sourceName, realFileName);
            return targetName;
        }
    }
Main flow on the client side (TypeScript):
1. First, import Resumable.js.
2. Create a Resumable object and set target and chunkSize, where target is the file upload URL.
3. Register handlers for common events such as fileAdded, complete, progress, and fileSuccess.

import * as Resumable from 'resumablejs/resumable.js';
private UploaderOnInit(url:string): void {
    const r = new Resumable({
      target: url,
      chunkSize: 3 * 1024 * 1024, //3 MB
    });
    r.assignBrowse(document.getElementById('Uploader'), false);
    r.assignDrop(document.getElementById('Uploader'));
    r.on('fileAdded', function (file, event) {
      r.upload();
      // to do ...
    });
    r.on('complete', function () {
      r.files.pop();
      // to do ...
    });
    r.on('progress', function () {
       // to do ...
    });

    r.on('fileSuccess', function (file, message) {
       // to do ...
    });
  }
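The GET action in the controller above is what makes resuming possible: as far as I understand Resumable.js, when its `testChunks` option is enabled it asks the server whether a chunk already exists before sending it, and skips the chunk when the answer is 200. The sketch below only illustrates that decision. `chunksToSend` is a hypothetical helper, not part of Resumable.js, and it takes the server's answers as a plain array instead of making HTTP requests.

```typescript
// Hypothetical helper: given the total chunk count and the chunk numbers the
// server already reports as present, list the chunk numbers still to be sent.
function chunksToSend(totalChunks: number, alreadyOnServer: number[]): number[] {
  const present: { [chunkNumber: number]: boolean } = {};
  for (const n of alreadyOnServer) {
    present[n] = true;
  }
  const pending: number[] = [];
  for (let n = 1; n <= totalChunks; n++) {
    if (!present[n]) pending.push(n);
  }
  return pending;
}

// An upload of 5 chunks was interrupted after chunks 1, 2 and 4 reached the server.
console.log(chunksToSend(5, [1, 2, 4])); // only chunks 3 and 5 remain
```

Because the chunk files stay in the Upload directory until the file is assembled, the same check keeps working after a page reload, provided the client produces the same identifier for the same file.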
References
1. https://github.com/23/resumable.js
