Wednesday, July 28, 2010

Quick Guide to Using FTP on CentOS 5.4

Okay, so I'm assuming that, like me, you have just had a "fun" time installing CentOS 5.4 (in my case on a virtual PC) and now you want to do some FTP file transfers.

Let's imagine that we are trying to connect to "pc01.mywork.com" as user "me02". Let's assume that my work pc requires secure ftp.

First, make sure you have been assigned an ip address.

$ ping pc01.mywork.com

You will see replies from the remote PC if this succeeds, and a timeout or an error if it fails. If it fails, run this command (as root) to renew your IP address:

$ dhclient

Next we want to use secure ftp to connect to my machine. We use the command "sftp" rather than just "ftp" for the secure version.

$ sftp me02@pc01.mywork.com

This will prompt you for a password. Next you will see this:

sftp>

This prompt lets you carry out commands on the remote machine. For a fuller list of commands see http://ss64.com/nt/ftp.html and https://shell6.tdl.com/techsupport/ftp.html. I'll cover the most basic ones here.
  • sftp> ls - lists the files in the current directory of the remote computer
  • sftp> lls - lists the files in the current directory of the local computer
  • sftp> cd - changes the current directory of the remote computer
  • sftp> lcd - changes the current directory of the local computer
You can use these commands to navigate the folders on the local and remote computers. Once you are happy with the directories you can use the "get" or "mget" commands to transfer one or multiple files.

sftp> get "myfile.pdf"

This will transfer the file "myfile.pdf" to the current directory of the local machine.

sftp> mget *

This will transfer all the files in the remote directory to the local one.

Similarly, put and mput will transfer files from the local machine to the remote one.
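
If you find yourself typing the same commands every time, they can be collected into a batch file instead. Here is a minimal sketch in Java that just assembles the text of such a file. It assumes the OpenSSH sftp client, which can read commands from a file with its -b option, and it assumes key-based login is already set up, because batch mode cannot stop to ask for a password. The class name, directories and file names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class SftpBatchSketch {
    // Builds the text of an sftp batch file: one command per line.
    // Note: file names containing spaces would need quoting, which this sketch does not do.
    static String batchScript(String localDir, String remoteDir, List<String> filesToGet) {
        List<String> lines = new ArrayList<>();
        lines.add("lcd " + localDir);   // change the local directory
        lines.add("cd " + remoteDir);   // change the remote directory
        for (String file : filesToGet) {
            lines.add("get " + file);   // download each file in turn
        }
        lines.add("bye");               // end the session
        return String.join("\n", lines) + "\n";
    }

    public static void main(String[] args) {
        // Hypothetical directories and file name, purely for illustration.
        System.out.print(batchScript("/home/me/downloads", "reports",
                Arrays.asList("myfile.pdf")));
    }
}
```

Save the output to a file (say transfer.batch) and it could then be run with something like: sftp -b transfer.batch me02@pc01.mywork.com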




Tuesday, July 13, 2010

VBA - Zipping Files Using Word (Simple Tutorial)

You might have come across this tutorial: http://www.rondebruin.nl/windowsxpzip.htm about zipping files in VBA with the default Windows zipper program. This is a really good example of how to do it within Excel, but you might hit some problems when you start to copy it in Word.

I'll cover a really simple example of how to do it in Word.

Firstly, declare the Windows API function that allows us to tell the application to wait or "sleep". This will become important when we are waiting on files to be added to our zip file.

Private Declare Sub Sleep Lib "kernel32.dll" (ByVal dwMilliseconds As Long)

Next declare the following variables:

Dim ShellApplication As Object 'Shell Application that we will use to copy files to the zip folder
Dim ZipFilePath As Variant 'A variant for the file path the zip should be written to (Array is not a valid type, and Shell's Namespace expects a Variant rather than a String variable)
Dim FileNamesArray 'A variant to act as an array to store the file names

We'll poach the NewZip method from the original Excel tutorial, which will create an empty zip file at the location specified by the sPath parameter.

Sub NewZip(sPath)
If Len(Dir(sPath)) > 0 Then Kill sPath
Open sPath For Output As #1
Print #1, Chr$(80) & Chr$(75) & Chr$(5) & Chr$(6) & String(18, 0)
Close #1
End Sub
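
If you are wondering what those Chr$ values are: 80, 75, 5, 6 followed by 18 zeros is the 22-byte "end of central directory" record, which is all that a zip file with no entries needs to contain. As a rough illustration only, here is the same byte layout built in Java. The class and method names are my own, and the temp file is just a stand-in for a real output path.

```java
import java.nio.file.Files;
import java.nio.file.Path;

final class EmptyZipSketch {
    // Builds the record of a zip file with no entries:
    // the signature bytes 80 ('P'), 75 ('K'), 5, 6 followed by 18 zero bytes.
    static byte[] emptyZipBytes() {
        byte[] bytes = new byte[22]; // Java fills new arrays with zeros
        bytes[0] = 80; // 'P'
        bytes[1] = 75; // 'K'
        bytes[2] = 5;
        bytes[3] = 6;
        return bytes;
    }

    public static void main(String[] args) throws Exception {
        // Write the bytes to a temporary file so the result can be inspected.
        Path target = Files.createTempFile("MyZipFile", ".zip");
        Files.write(target, emptyZipBytes());
        System.out.println(target + " holds " + Files.size(target) + " bytes");
    }
}
```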

Great, we're ready to get started with the zipping. Declare a Sub of your choice for the code and let's pretend we have 2 input files for simplicity. These could be passed to the method as a parameter, or a dialogue box could be used for selecting them, but we will keep it simple for now.

Dim FileNameString As String
FileNameString = "C:\Desktop\InputFile1,C:\Desktop\InputFile2" 'no spaces around the comma, or they end up in the paths
FileNamesArray = Split(FileNameString, ",")

So we have split the input string of file names on commas and put each item into the array. Note that FileNamesArray is not actually typed as an Array - it's actually a Variant so we will have to be careful later on when retrieving items from it.

Next we'll define the output destination for the zip file.

ZipFilePath = "C:\Desktop\MyZipFile.zip"

Now we are ready to call the NewZip method on the location we just specified.

NewZip (ZipFilePath)

This should create an empty zip file where we told it to. Now we are ready to copy the files to the zip folder.

Remember the Shell Application we defined at the start? We need to initialise it.

Set ShellApplication = CreateObject("Shell.Application")

Now we need to loop over all the items in the FileNamesArray and copy them to the zip folder. We need to declare a counter for selecting items in the array as well as a checker to see that the current item has been copied to the zip file before adding another.

Dim Counter As Integer, ItemChecker As Integer 'each variable needs its own type, otherwise Counter would be a Variant
Counter = 0 'array index of the item being copied
ItemChecker = 0 'will be used to check if the current file has been copied

This just leaves looping over the array and copying the files.

For Counter = 0 To UBound(FileNamesArray) 'UBound gets the last index of the array
ItemChecker = ItemChecker + 1 'increment the item count
ShellApplication.Namespace(ZipFilePath).CopyHere CStr(FileNamesArray(Counter)) 'copy the file to the zip folder

Notice the CStr call - this converts a Variant to a String. This is very important because the copy will fail otherwise - we didn't specifically define a String Array at the start, remember.

Finally, we need to tell the application to wait until the current file has been copied before moving on to the next one. We can do this by using the Items.Count value, which tells us how many items are in the location we pass to it - in this case the ZipFilePath.

Do Until ShellApplication.Namespace(ZipFilePath).Items.Count = ItemChecker
Sleep 500 'go to sleep for a tiny moment
Loop
Next Counter

And hurray! We're done :)
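
One thing to be aware of with the wait loop above: if a file never arrives in the zip (a wrong path, for example), it will spin forever. The general shape of "poll a count until it reaches the expected value, but give up after a while" can be sketched like this. It is written in Java purely as an illustration of the idea, not as a drop-in for the VBA. The timeout is my own addition and all the names are made up.

```java
import java.util.function.IntSupplier;

final class WaitForCountSketch {
    // Polls currentCount until it reaches expected, sleeping between checks.
    // Returns false if the timeout passes first or the thread is interrupted.
    static boolean waitForCount(IntSupplier currentCount, int expected,
                                long pollMillis, long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (currentCount.getAsInt() < expected) {
            if (System.currentTimeMillis() >= deadline) {
                return false; // gave up: the count never reached the expected value
            }
            try {
                Thread.sleep(pollMillis); // go to sleep for a tiny moment
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // keep the interrupt flag for callers
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Stand-in for Items.Count: pretends one more file has arrived on every poll.
        int[] copied = {0};
        boolean done = waitForCount(() -> ++copied[0], 2, 10, 1000);
        System.out.println("finished copying: " + done);
    }
}
```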

Wednesday, July 7, 2010

iText - Inserting New Pages

We might want to add a new blank page to a document, or import a page from an existing pdf to a new one.

New Pages

Inserting a new blank page using PdfStamper is really easy.

PdfStamper stamper = new PdfStamper(reader, new FileOutputStream("NewPdf.pdf"));
stamper.insertPage(1, PageSize.A4); //insert a new blank A4 page

It's important to note that with the PdfStamper the whole pdf is read in. By using the insertPage method we are inserting as if it were into the original document. There is no need to iterate over pages or anything like that - it's all immediately read in as soon as the stamper is declared.

If you want to stamp content onto it, that's really easy too:

PdfContentByte overContent = stamper.getOverContent(1);
overContent.beginText();
// ... add content
overContent.endText();

Things get a little bit more complicated when you want to insert selected pages from an existing pdf into a new document.

Importing Pages - PdfStamper

Sometimes you might want to import an existing page from a pdf and insert it into a new pdf document you are creating. You might think that PdfReader would do all this, but the problem is that PdfReader accesses the content stream of a page, which refers to external objects like fonts etc. It is much safer to pass the PdfReader to another class such as the PdfStamper or PdfWriter, so that they can retrieve any resources behind the scenes and return the full pdf page.

To achieve this, we use PdfImportedPage, which bundles up all the necessary resources into an object representing an existing pdf page.

PdfReader reader = new PdfReader("OriginalPdf.pdf"); //reads the original pdf
PdfStamper stamper = new PdfStamper(reader, new FileOutputStream("StampedPdf.pdf")); //writes the new pdf to file
PdfImportedPage page; //will hold the imported page

page = stamper.getImportedPage(reader,2); //retrieve the second page of the original pdf
PdfContentByte newPageContent = stamper.getOverContent(1); //get the over content of the first page of the new pdf
newPageContent.addTemplate(page, 0,0); //add the original page as a "template" for the new one with no transformations

You can use transformations to downsize pages, rotate them or whatever you want.
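
To give a feel for what a transformation looks like: as I understand it, addTemplate also has a version that takes six extra numbers (a, b, c, d, e, f), which form the usual pdf transformation matrix. A point (x, y) on the imported page ends up at (a*x + c*y + e, b*x + d*y + f). Below is a small plain-Java sketch of that arithmetic for "shrink to half size and move a bit". It uses no iText classes, and the class and method names are just for illustration.

```java
final class TemplateMatrixSketch {
    // Returns {a, b, c, d, e, f} for scaling by 'scale' and then moving by (dx, dy).
    static double[] scaleAndMove(double scale, double dx, double dy) {
        return new double[] {scale, 0, 0, scale, dx, dy};
    }

    // Applies the matrix to a point: x' = a*x + c*y + e, y' = b*x + d*y + f.
    static double[] apply(double[] m, double x, double y) {
        return new double[] {m[0] * x + m[2] * y + m[4], m[1] * x + m[3] * y + m[5]};
    }

    public static void main(String[] args) {
        double[] m = scaleAndMove(0.5, 50, 100);
        // Where the top right corner of an A4 page (595 x 842) ends up.
        double[] corner = apply(m, 595, 842);
        System.out.println(corner[0] + ", " + corner[1]);
        // With iText the six values would be passed along these lines (not run here):
        // newPageContent.addTemplate(page, 0.5f, 0, 0, 0.5f, 50, 100);
    }
}
```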

The problem with PdfStamper for importing pages is that it only works for importing pages from one and only one pdf. If you want to import pages from several pdfs then you will need to use a PdfWriter.

Importing Pages - PdfWriter

Remember that when we use PdfWriter, we need to declare a new document to work with. This isn't necessary with PdfStamper.

PdfReader reader = new PdfReader("OriginalPdf.pdf"); //reads the original pdf
Document document = new Document(); //new pdf document
PdfWriter writer = PdfWriter.getInstance(document, new FileOutputStream("ModifiedPdf.pdf")); //writes the new pdf to file

document.open(); //open the document
PdfImportedPage page = writer.getImportedPage(reader, 2); //retrieve the second page of the original pdf
PdfContentByte newPageContent = writer.getDirectContent(); //get the direct content of the new document
newPageContent.addTemplate(page, 0, 0); //add the original page as a "template" for the new one with no transformations
document.close(); //close the document so the pdf is written out completely

Note that PdfWriter does not have the per-page over and under options of the PdfStamper. Its PdfContentByte is accessed with getDirectContent() (there is also a getDirectContentUnder() for drawing beneath it).

There is a major downside to remember when importing pages this way. All interactive features such as bookmarks, fields and so forth are lost in the process. If you want to retain them, you have to use PdfCopy instead.

In-depth iText information can be found in Bruno Lowagie's excellent book iText In Action http://www.manning.com/lowagie2/

Monday, July 5, 2010

iText - Using Under and Over Content With PdfStamper

Using PdfStamper we can gain access to both the over and under content of a pdf.

Over content - provides a canvas for adding items such as text or graphics on top of the existing content.
Under content - provides a canvas for adding items under the existing content, such as watermarks.

These are exposed as 2 separate PdfContentByte objects per page, as opposed to having just one PdfContentByte to add items to. They are retrieved using 2 methods of the PdfStamper class:

//declarations
...
PdfStamper stamper = new PdfStamper(reader, new FileOutputStream("StampedPdf.pdf"));
int pageNumber = 1;
PdfContentByte overContent = stamper.getOverContent(pageNumber); //over content
PdfContentByte underContent = stamper.getUnderContent(pageNumber); // under content

If we wanted to add a watermark to every page, it might look like this:

PdfReader reader = new PdfReader("OriginalPdf.pdf"); //reads the original pdf
PdfStamper stamper = new PdfStamper(reader, new FileOutputStream("WatermarkedPdf.pdf")); //create a stamper which writes a new pdf to file
PdfContentByte underContent; // PdfContentByte for accessing the under content of the original pdf pages

Image watermarkImage = Image.getInstance("watermark.jpg"); //image to be added as a watermark to the pages
watermarkImage.setAbsolutePosition(100,300); //set the position of the watermark

int totalPages = reader.getNumberOfPages(); //get the total number of pages in the original pdf

//for every page in the original pdf
for(int currentPage = 1; currentPage <= totalPages; currentPage++)
{
underContent = stamper.getUnderContent(currentPage); //get the under content of the current page
underContent.addImage(watermarkImage); //add the watermark to it
}
stamper.close();

There are some important things to note here. Firstly, you MUST remember to close the stamper, or the outputted pdf will be corrupt once the program has finished.

Secondly, the page orientation can be a problem here, since if you have a pdf with different page orientations, the coordinate system will have to be altered accordingly. I add the image to point (100,300) here, but this would have to be rotated 90 degrees on a landscape page. You can avoid this by calling this method on the stamper before you use it:

stamper.setRotateContents(false);

This stops the coordinate system in the landscape pages from being represented as height * width instead of width * height so essentially adds the image to the landscape page as if it were portrait.

In-depth iText information can be found in Bruno Lowagie's excellent book iText In Action http://www.manning.com/lowagie2/

iText - Reading PDF Documents with PdfReader

PdfReaders can be used for a variety of purposes such as accessing document properties like pdf version and file length as well as page sizes and rotation, bookmark information and metadata.

The standard constructor for PdfReader looks like this:

PdfReader reader = new PdfReader("Test.pdf");

It is worth noting that when manipulating larger pdf documents there is an alternative, memory saving constructor that can be used.

PdfReader reader = new PdfReader(new RandomAccessFileOrArray("Test.pdf"), null);

This reads the file only partially at first, so less memory is used initially. Memory use then increases as the pdf is worked with.

Document properties

The main property you will be interested in is the number of pages in a document, but you can also query other interesting things like file length and whether the document is encrypted.

int pagesInDocument = reader.getNumberOfPages();
int fileLength = reader.getFileLength();
boolean encrypted = reader.isEncrypted();

Notice that here we read in a pdf file saved on disk. It is also possible to read a pdf buffered in memory.

ByteArrayOutputStream bufferedPdf;
...
//add stuff to the output stream using appropriate writer
...
PdfReader reader = new PdfReader(bufferedPdf.toByteArray()); //read the buffered pdf

This works exactly the same as writing it to disk and reading it in, but saves having to create and delete a file if you were manipulating the pdf before output.

Page Size and Rotation

There are various methods for retrieving interesting information about pdf pages within a document.

Rectangle pageSize = reader.getPageSize(1); //get the page size for the first page
int pageRotation = reader.getPageRotation(1); // get the rotation of the first page
Rectangle pageSizeWithRotation = reader.getPageSizeWithRotation(1); //takes rotation into account

Let's imagine we have a document with 2 pages in it, both A4 (595x842) but the second page is landscape. getPageSize(1) and getPageSize(2) would both return a rectangle with dimensions 595x842. However, getPageSizeWithRotation(1) would return 595x842 and getPageSizeWithRotation(2) would return 842x595 - the same as before but rotated 90 degrees for landscape.
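
The arithmetic behind this is simple enough to sketch in plain Java: a rotation of 90 or 270 degrees swaps the width and the height, anything else leaves them alone. This is only my illustration of the behaviour described above, not iText's actual code, and the names are made up.

```java
final class PageSizeSketch {
    // Returns {width, height} as they appear once the page rotation is applied.
    static float[] sizeWithRotation(float width, float height, int rotationDegrees) {
        int r = ((rotationDegrees % 360) + 360) % 360; // normalise to the range 0..359
        boolean swapped = (r == 90 || r == 270);       // quarter turns swap the sides
        return swapped ? new float[] {height, width} : new float[] {width, height};
    }

    public static void main(String[] args) {
        float[] portrait = sizeWithRotation(595, 842, 0);   // page 1 in the example
        float[] landscape = sizeWithRotation(595, 842, 90); // page 2 in the example
        System.out.println(portrait[0] + " x " + portrait[1]);
        System.out.println(landscape[0] + " x " + landscape[1]);
    }
}
```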

Retrieving Bookmarks

Internal bookmarks are retrieved as a List using the PdfReader.

List bookmarkList = SimpleBookmark.getBookmark(reader);

Metadata

Metadata is stored as key-value pairs in a map. To access it, we can retrieve this Map and iterate over it.

Map info = reader.getInfo(); // Map of key value pairs for the metadata
String key;
String value;
for (Iterator i = info.keySet().iterator(); i.hasNext();) {
key = (String) i.next();
value = (String) info.get(key);
}

You can add to this metadata map using a PdfStamper:
...
//declarations
...
Map info = reader.getInfo();
info.put("Subject", "New Metadata");
stamper.setMoreInfo(info);
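
If you would rather avoid the casts, the same read-and-add pattern can be written with generics. Here is a small sketch that uses an ordinary LinkedHashMap as a stand-in for the map returned by reader.getInfo(), so it runs without iText. The "Title" entry and the class and method names are made up for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class MetadataSketch {
    // Stand-in for reader.getInfo(): a map of metadata keys to values.
    static Map<String, String> sampleInfo() {
        Map<String, String> info = new LinkedHashMap<>();
        info.put("Title", "Test");
        return info;
    }

    // Walks the key-value pairs and formats one "key = value" line per entry.
    static String describe(Map<String, String> info) {
        StringBuilder text = new StringBuilder();
        for (Map.Entry<String, String> entry : info.entrySet()) {
            text.append(entry.getKey()).append(" = ").append(entry.getValue()).append('\n');
        }
        return text.toString();
    }

    public static void main(String[] args) {
        Map<String, String> info = sampleInfo();
        info.put("Subject", "New Metadata"); // the same addition as in the stamper example
        System.out.print(describe(info));
    }
}
```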

In-depth iText information can be found in Bruno Lowagie's excellent book iText In Action http://www.manning.com/lowagie2/