Using Drive Api / Driveapp To Convert From Pdfs To Google Documents
Solution 1:
You want to convert the PDF files in a Team Drive folder to Google Documents and save the converted files to a folder in your own Google Drive. If my understanding is correct, how about this method?
A PDF can be converted to a Google Document not only with Drive.Files.insert() but also with Drive.Files.copy(). The advantages of using Drive.Files.copy() are:
- Although Drive.Files.insert() has a size limit of 5 MB, Drive.Files.copy() can handle files larger than 5 MB.
- In my environment, the processing speed was faster than with Drive.Files.insert().
For this method, I would like to propose the following 2 patterns.
Pattern 1 : Using Drive API v2
In this case, Drive API v2 of Advanced Google Services is used for converting files.
function myFunction() {
  var sourceFolderId = "/* source folder id */";
  var destinationFolderId = "/* dest folder id */";
  var files = DriveApp.getFolderById(sourceFolderId).getFiles();
  while (files.hasNext()) {
    // Copy each file into the destination folder, converting it to a Google Document.
    var res = Drive.Files.copy({parents: [{id: destinationFolderId}]}, files.next().getId(), {convert: true, ocr: true});
    // Logger.log(res) // Uncomment this line to log the response.
  }
}
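Pattern 1 copies every file in the source folder. If the folder may also hold files that are not PDFs, the IDs could be collected with the folder's getFilesByType() method first. This is only a sketch: the helper name collectPdfIds is hypothetical, and the folder is passed in as a parameter so the logic does not depend on a particular folder ID.

```javascript
// Hypothetical helper (not part of the original answer).
// `folder` is expected to behave like the object returned by
// DriveApp.getFolderById(), i.e. to offer getFilesByType(mimeType).
function collectPdfIds(folder) {
  var ids = [];
  var files = folder.getFilesByType("application/pdf");
  while (files.hasNext()) {
    ids.push(files.next().getId());
  }
  return ids;
}
```

In an Apps Script project this would be called as collectPdfIds(DriveApp.getFolderById(sourceFolderId)), and each returned ID would then be passed to Drive.Files.copy() as in Pattern 1.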
Pattern 2 : Using Drive API v3
In this case, Drive API v3 is used for converting files, and the requests are sent as batch requests. One batch request can carry up to 100 API calls in a single HTTP call, which helps with the API quota issue.
function myFunction() {
  var sourceFolderId = "/* source folder id */";
  var destinationFolderId = "/* dest folder id */";
  var files = DriveApp.getFolderById(sourceFolderId).getFiles();
  // Build one copy request per file; the target mimeType triggers the conversion.
  var rBody = [];
  while (files.hasNext()) {
    rBody.push({
      method: "POST",
      endpoint: "https://www.googleapis.com/drive/v3/files/" + files.next().getId() + "/copy",
      requestBody: {
        mimeType: "application/vnd.google-apps.document",
        parents: [destinationFolderId]
      }
    });
  }
  var cycle = 100; // Number of API calls in 1 batch request.
  for (var i = 0; i < Math.ceil(rBody.length / cycle); i++) {
    var offset = i * cycle;
    var body = rBody.slice(offset, offset + cycle);
    var boundary = "xxxxxxxxxx";
    var contentId = 0;
    var data = "--" + boundary + "\r\n";
    body.forEach(function(e) {
      data += "Content-Type: application/http\r\n";
      data += "Content-ID: " + (++contentId) + "\r\n\r\n";
      data += e.method + " " + e.endpoint + "\r\n";
      data += e.requestBody ? "Content-Type: application/json; charset=utf-8\r\n\r\n" : "\r\n";
      data += e.requestBody ? JSON.stringify(e.requestBody) + "\r\n" : "";
      data += "--" + boundary + "\r\n";
    });
    var options = {
      method: "post",
      contentType: "multipart/mixed; boundary=" + boundary,
      payload: Utilities.newBlob(data).getBytes(),
      headers: {'Authorization': 'Bearer ' + ScriptApp.getOAuthToken()},
      muteHttpExceptions: true,
    };
    // Drive-specific batch endpoint; the global https://www.googleapis.com/batch endpoint has been discontinued.
    var res = UrlFetchApp.fetch("https://www.googleapis.com/batch/drive/v3", options).getContentText();
    // Logger.log(res); // Uncomment this line to log the response.
  }
}
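The chunking and body-building steps of Pattern 2 can be separated from the Apps Script services so that they can be checked on their own. The following is a minimal sketch rather than the answer's original code: the helper names chunkRequests and buildBatchBody are hypothetical, and this version ends the payload with a closing "--boundary--" delimiter, as multipart bodies normally do.

```javascript
// Hypothetical helpers (names are illustrative, not part of any Google API).

// Split the request list into groups of at most `size` entries,
// because one batch request carries at most 100 API calls.
function chunkRequests(requests, size) {
  var chunks = [];
  for (var i = 0; i < requests.length; i += size) {
    chunks.push(requests.slice(i, i + size));
  }
  return chunks;
}

// Build the multipart/mixed payload for one group of requests.
// Each entry has the shape { method, endpoint, requestBody }.
function buildBatchBody(requests, boundary) {
  var data = "";
  requests.forEach(function (e, index) {
    data += "--" + boundary + "\r\n";
    data += "Content-Type: application/http\r\n";
    data += "Content-ID: " + (index + 1) + "\r\n\r\n";
    data += e.method + " " + e.endpoint + "\r\n";
    if (e.requestBody) {
      data += "Content-Type: application/json; charset=utf-8\r\n\r\n";
      data += JSON.stringify(e.requestBody) + "\r\n";
    } else {
      data += "\r\n";
    }
  });
  // The closing delimiter marks the end of the multipart body.
  data += "--" + boundary + "--\r\n";
  return data;
}
```

With these helpers, each element of chunkRequests(rBody, 100) would be passed to buildBatchBody() and the result sent with UrlFetchApp.fetch() using the same options as in Pattern 2.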
Note :
- If 100 API calls in one batch request is too many, please modify var cycle = 100.
- If Drive API v3 cannot be used for the Team Drive, please tell me. I can convert this to Drive API v2.
- If the Team Drive is the cause of the issue in your situation, can you try this after copying the PDF files to your own Google Drive?
If these are not useful for you, I'm sorry.
Solution 2:
First, fetch the IDs of all the files and store them in a Google Sheet. Then process each file as usual by its ID, and mark the file as processed once it is done. Before processing a file, check whether it has already been processed.
If there are many files, you can also store the row number up to which you have processed and continue from there next time.
Finally, create a trigger to execute your function every 10 minutes or so.
This way you can overcome the execution time limit of a single execution. The API request quota and similar limits are not bypassed by this method.
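The bookkeeping described in this solution could look roughly like the sketch below. It is written against a plain array so that the logic is visible on its own; in a real script the rows would come from the sheet, processFile would do the conversion, and a time-driven trigger would call the function repeatedly. The names processPending, rows, processFile, maxMillis and now are assumptions made for illustration.

```javascript
// Hypothetical sketch: `rows` stands in for sheet rows of the form
// { id: <file id>, processed: <boolean> }.
function processPending(rows, processFile, maxMillis, now) {
  now = now || Date.now;                      // injectable clock, defaults to the real one
  var start = now();
  var done = 0;
  for (var i = 0; i < rows.length; i++) {
    if (rows[i].processed) continue;          // skip files handled in an earlier run
    if (now() - start >= maxMillis) break;    // stop before the execution time limit
    processFile(rows[i].id);
    rows[i].processed = true;                 // mark the file as processed
    done++;
  }
  return done;                                // number of files handled in this run
}
```

Each triggered run picks up where the previous one stopped, because rows already marked as processed are skipped.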