Excel CSV from column of data

TangoWhiskey

Touchdown! Greaser!
Joined
Feb 23, 2005
Messages
14,210
Location
Midlothian, TX
Display Name

Display name:
3Green
Here's a common work problem I run into. Somebody sends me an Excel spreadsheet, one of the columns in it has a list of several server names:


[ROW][CELL]Server1[/CELL][/ROW]
[ROW][CELL]Server2[/CELL][/ROW]
[ROW][CELL]Server3[/CELL][/ROW]
[ROW][CELL]...[/CELL][/ROW]
[ROW][CELL]Server129[/CELL][/ROW]

I copy that list, put it in Word, and use the advanced find-and-replace tools to translate it into a comma separated quoted list like this:

'Server1', 'Server2', 'Server3', '...', 'Server129'

This I then use in a SQL statement to narrow results:

Code:
SELECT STUFF FROM TABLES WHERE SERVERNAME IN ('Server1', 'Server2', 'Server3', '...', 'Server129')

My question is... does anybody have a macro or a utility that will automate the conversion of an Excel copied range of cells into a quoted list of comma-separated values?

I know you Unix gurus will probably tell me how to do this with sed or awk. I do have Cygwin installed, so I can use that option if there's a way to do it with a multi-line list of values.

What I'd REALLY like is a shell shortcut to "Paste as CSV" that I could use with any clipboard text. :)

P.S.--And again I wonder why the TABLE tag in this forum software inserts so much white space at the top, above the table...
 
Save the excel file as a CSV.

If it looked like this:
Code:
blah,blah,blah,server1
blah,blah,blah,server2
blah,blah,blah,server3
blah,blah,blah,server4
blah,blah,blah,server5
You could do the following with bash, cat, awk, and tr.
Code:
cat test.csv | awk -F, '{print $4}' | tr '\n' ,
Which would output:
Code:
server1,server2,server3,server4,server5
 
Here's one way:

Edit -> Copy the selected range
Open a New worksheet
Edit -> Paste Special and check the Transpose
Save As Comma Seperated Values file Type
 
P.S.--And again I wonder why the TABLE tag in this forum software inserts so much white space at the top, above the table...

Because you pretty-print your entry--- adding a carriage-return after each line. Those CRs are not part of a cell, so they get printed above the table.

You should do it all on one line like this (spaces are added to this example to inhibit the interpreter):

[ table][ row][ cell]Server1[/ cell][/ row][ row][ cell]Server2[/ cell][/ row][ row][ cell]Server3[/ cell][/ row][ row][ cell]...[/ cell][/ row][ row][ cell]Server129[/ cell][/ row][/ table]

Then it would look like this:
[row][cell]Server1[/cell][/row][row][cell]Server2[/cell][/row][row][cell]Server3[/cell][/row][row][cell]...[/cell][/row][row][cell]Server129[/cell][/row]
 
Last edited:
Because you pretty-print your entry--- adding a carriage-return after each line. Those CRs are not part of a cell, so they get printed above the table.

You should do it all on one line like this (spaces are added to this example to inhibit the interpreter):

[ table][ row][ cell]Server1[/ cell][/ row][ row][ cell]Server2[/ cell][/ row][ row][ cell]Server3[/ cell][/ row][ row][ cell]...[/ cell][/ row][ row][ cell]Server129[/ cell][/ row][/ table]

Then it would look like this:
[row][cell]Server1[/cell][/row][row][cell]Server2[/cell][/row][row][cell]Server3[/cell][/row][row][cell]...[/cell][/row][row][cell]Server129[/cell][/row]

Ah, thanks Peggy!!
 
Save the excel file as a CSV.

If it looked like this:
Code:
blah,blah,blah,server1
blah,blah,blah,server2
blah,blah,blah,server3
blah,blah,blah,server4
blah,blah,blah,server5
You could do the following with bash, cat, awk, and tr.
Code:
cat test.csv | awk -F, '{print $4}' | tr '\n' ,
Which would output:
Code:
server1,server2,server3,server4,server5


Thanks Jesse. Will have to read up on the tr command and what it does... I understand the rest of what you wrote.
 
Here's one way:

Edit -> Copy the selected range
Open a New worksheet
Edit -> Paste Special and check the Transpose
Save As Comma Seperated Values file Type

Good trick! Didn't know about the Transpose feature.

I'd give this post the winning vote (do it all from the same tool) if you can tell me how to specify to either (my choice) single or double quote the values...

I'm getting:

server1,server2,server3

Instead of:

'server1','server2','server3'
 
Thanks Jesse. Will have to read up on the tr command and what it does... I understand the rest of what you wrote.

the Tr command just replaces line breaks from awk with commas. If you want single quotes around the values:
Code:
cat test.csv | awk -F, '{print $4}' | tr '\n' , | sed -e 's_,_\",\"_g'
Would give you:
Code:
server1","server2","server3","server4","server5","
Which obviously has a problem on the start and end..Which could be cleaned up if you really want it perfect coming out (btw this is ugly but I don't have the time to come up with something pretty. It works):
Code:
sosimple=`cat test.csv | awk -F, '{print $4}' | tr '\n' , | sed -e 's_,_\",\"_g' | sed 's/.\{2\}$//'` ; echo \"$sosimple
Would output:
Code:
"server1","server2","server3","server4","server5"
I'm more of an awk / sed / bash guy then I am excel. I pretty much don't do anything in excel.
 
the Tr command just replaces line breaks from awk with commas. If you want single quotes around the values:
Code:
cat test.csv | awk -F, '{print $4}' | tr '\n' , | sed -e 's_,_\",\"_g'
Would give you:
Code:
server1","server2","server3","server4","server5","
Which obviously has a problem on the start and end..Which could be cleaned up if you really want it perfect coming out (btw this is ugly but I don't have the time to come up with something pretty. It works):
Code:
sosimple=`cat test.csv | awk -F, '{print $4}' | tr '\n' , | sed -e 's_,_\",\"_g' | sed 's/.\{2\}$//'` ; echo \"$sosimple
Would output:
Code:
"server1","server2","server3","server4","server5"
I'm more of an awk / sed / bash guy then I am excel. I pretty much don't do anything in excel.


Jesse, you rock.
 
How about:
Code:
cat test.csv | awk -F, '{a=a (a ? "," : "") $4} END {print a}'
Or, if you wanted quotes, for some reason, then:
Code:
cat test.csv | awk -F, '{a=a (a ? "," : "") "\""$4"\""} END {print a}'
-harry
 
Here's one way:

Edit -> Copy the selected range
Open a New worksheet
Edit -> Paste Special and check the Transpose
Save As Comma Seperated Values file Type

What he said, except you'd end up with one per line. Load it into word, find and replace CRs(? how you do dat?) with commas. Save as text.

orrr. use Jesse's or Harry's scripts. You can BTW, do that in Windows with a WISH script but it ain't as easy. WSH has regular expressions now.
 
Last edited:
Harry's a power coder... wow. I need to analyze what's happening there with the awk statement after the -F...

P.S.--I love the range of knowledge we have amongst the talented members of this board, and their unending willingness to share it.
 
What he said, except you'd end up with one per line. Load it into word, find and replace CRs(? how you do dat?) with commas. Save as text.

orrr. use Jesse's or Harry's scripts. You can BTW, do that in Windows with a WISH script but it ain't as easy. WSH has regular expressions now.

What he described worked for me, Mike... key is the "transpose" checkbox, which flips that column of data around into a row... csv adds the commas.
 
I need to analyze what's happening there with the awk statement after the -F...
Breaking it down, we have "{ first stuff} END {second stuff}".

awk executes the code in "first stuff" once for each line of input text. Then, once it runs out of text to process, it executes the code in "second stuff" once at the end. In first stuff, we accumulate our output text in a variable called "a", and in "second stuff", we just print out the final value of that variable once at the end, when we're done.

In that first set of brackets, we have "a = a $4". awk does string concatenation implicitly, so "a $4" is just the current value of a concatenated with $4, which is the field in the fourth column of text. So this is the core code that "accumulates" all the $4 fields into the variable a, as it processes lines of input text.

But we also wanted to insert commas between them. However, we don't want extra commas at the beginning or at the end, so what we do is only add a comma before a $4 field if the variable a already has something in it, which means that we're necessarily not on our first SERVER (which is the only one we don't want to precede with a comma).

That's accomplished via our conditional expression:
Code:
(a ? "," : "")
If a evaluates to true (because it's not empty), then this expression evaluates to the thing after the ?, which is "," . If a evaluates to false (because it is empty), then this expression evaluates to the thing after the :, which is "", i.e. nothing. So this whole thing inside parentheses either inserts a comma if a already has something in it, or else inserts nothing if it's empty, which implies that this is our first field we spit out (and thus don't want to precede it with a comma).
-harry
 
Harry, that's just plain awesome, and very simply stated. You a teacher? You should be!! Oh, wait, you ARE! You just taught me.
 
Here's a result working within Excel (attached ZIP file). The file has legitimate Excel macros after you unzip it. Excel format Excel 2002.

To use- select the data you want to convert. It will be in a single column- the program will ignore the additional columns beond the first.

Use menu item Tool, select Macro> and select Macros...

Select the item CombineServers and click Run

Your data will be combined in cell D1. Copy it and paste it where ever you want. It should have all the quotes and commas...

The source code is available in the file and commented so you can easily change it.

Enjoy!
 

Attachments

  • POA CombineStrings.zip
    8.4 KB · Views: 2
Cool, Jack, thanks! This was a fun thread!! I appreciate the range of solutions and methodologies used to present them! THANK YOU ALL!!
 
Cool--Good job Harry. A hell of a lot better then that quick hack I came up with.
 
Cool--Good job Harry. A hell of a lot better then that quick hack I came up with.

Don't tempt me. There are a few perl modules that can grok Office files, meaning you could do the whole thing without running Excel at all. B)
 
While the Unix method is elegant, it doesn't reduce my workload much. Jack's solution is "best" from that standpoint--it uses the same tool I'm already working in.

I modified his code slightly. If you add a project reference to the Microsoft Forms 2.0 library, you can access the DataObject object, which lets you put the assembled string onto the clipboard. So now, I can highlight the list of servers, run the macro, alt-tab over to my SQL editor, and paste in the formatted list into my query:

Code:
Option Explicit
Sub CombineServers()
Const SEPARATOR1 As String = "'" 'charater required around string
Const SEPARATOR2 As String = "," 'character required between string
Const ANSWERROW As Long = 1      'row to store results
Const ANSWERCOL As Long = 4      'column to store results
Dim nNumRows As Long            'number of rows in the selection
Dim nCurRow As Long             'loop control- current row
Dim strData As String           'data read from the cell
Dim strFinishedString As String 'final string for saving
Dim myDataObj As New DataObject

'get the number of rows in the data
nNumRows = Selection.Rows.Count

'null the string, add ' because XL will lose it. XL thinks
' ' character denote a string in a general formatted cell
'strFinishedString = "'"

For nCurRow = 1 To nNumRows
    'add the surrounding separator
    strFinishedString = strFinishedString + SEPARATOR1
    'read the data
    strData = Selection.Cells(nCurRow, 1).Value
    'add it to the string
    strFinishedString = strFinishedString + strData
    'add the surrounding separator
    strFinishedString = strFinishedString + SEPARATOR1
    'if we are getting another cell, add the separator between the values
    If nCurRow < nNumRows Then
        strFinishedString = strFinishedString + SEPARATOR2
    End If
'let windoze do something else for awhile. Also let XL respond to events
DoEvents
Next nCurRow

myDataObj.SetText strFinishedString
myDataObj.PutInClipboard

' ActiveSheet.Cells(ANSWERROW, ANSWERCOL).Value = strFinishedString
End Sub
 
Cool- I wondered how to get something in the clipboard from VBA. Thanks much!

If you use ActiveX Data Objects you can run SQL from XL. I'm no expert in this but I am doing using ADO for a project at work with Excel.
 
Heck, I'll even request an ENHANCEMENT! :)
I have TWO columns that need to be concatenated together this way. And, for extra credit, put a constant string before and after the list. So, with three rows containing 8 columns, and we want the 6th and 7th columns in our list, we want output of
select * from tbl where tbl.key in ("a6","a7","b6","b7")
The file to be parsed is on a NAS head that can be accessed from either Windows or Linux.

Taking Harry's script (Great job, BTW, Harry:yes:), I would get something like:
Code:
cat test.csv | awk -F, '{a=a (a ? "," : "") "\""$6"\"\,"\""$7"\""
} END {print "select \* from tbl where tbl\.key in \(" a "\)"'
Of course, I'd have to save the spreadsheet as an XML file first.

Wait, though! When I originally created the spreadsheet, it was actually an XML file with an .xls extension! If, after modification by the end users in Excel, it's STILL XML, then I could do something to actually parse the XML! Hmmm... I'll investigate this further!


Mike, if you wanted to do this in Perl! Nah, probably couldn't be done!:goofy:
 
Code:
 ___                       _         _                            _ 
(  _`\                    ( )       ( )_           /'\_/`\       ( )
| ( (_) _ __   __     __  | |/')    | ,_)   _      |     |   __  | |
| |___ ( '__)/'__`\ /'__`\| , <     | |   /'_`\    | (_) | /'__`\| |
| (_, )| |  (  ___/(  ___/| |\`\    | |_ ( (_) )   | | | |(  ___/| |
(____/'(_)  `\____)`\____)(_) (_)   `\__)`\___/'   (_) (_)`\____)(_)
                                                                 (_)
 
Heck, I'll even request an ENHANCEMENT! :)
I have TWO columns that need to be concatenated together this way. And, for extra credit, put a constant string before and after the list. So, with three rows containing 8 columns, and we want the 6th and 7th columns in our list, we want output of
select * from tbl where tbl.key in ("a6","a7","b6","b7")
The file to be parsed is on a NAS head that can be accessed from either Windows or Linux.

<SNIP>

The attached zip file contains an excel 2002 file with macors that accomplished the above task.

To run, select menu item Tools followed by Macros> followed by
Macros... Select CombineColumns and click the Run button.

Enter the prefix and suffix text, change your quote and separacter characters if desired, and choose up to 3 columns to combine. Columns can be in any order, just use the column letter to identify the column. At least one column must be used in the left textbox.

Click the Run button- the output is in the edit box at the botto of the dialog- copy this data and paste it whereever you need it.

The program assumes no header row(s) (data starts at row 1). The program goes down until there is no more data in the first column to be concatenated (assumes all columns have the same number of rows). I don't do error checking so it is possible to run past the column limit for Excel.

The program assumes you have entered something in the prefix and suffix- if nothing were entered, it will put empty strings at the beginning and end of the output. These are easily removed from the output. Fixing this and adding error checking is left as an excercise for the student.

Enjoy!

PS- what's an NAS head?
 

Attachments

  • POA Combinestrings1.zip
    14 KB · Views: 2
Here's a version with some error handling,the choice to set the starting row in case there is a header, and if the prefix and suffix are empty, it will not add an empty string to the front and end of the list.

I dislike unfinished business....

I couldn't figure out how to remove the file above without deleting the entire message.

Enjoy!
 

Attachments

  • POA Combinestrings1.zip
    15.7 KB · Views: 3
...Mike, if you wanted to do this in Perl! Nah, probably couldn't be done!:goofy:

"There's more than one way to do it."
http://cpan.uwinnipeg.ca/search?query=Excel&mode=module

I've done a lot of Excel VBA a few years back. The biggest problem I have is with Excel crashing with memory leaks (in my code :p) and corrupting the sheet and code. Since I can't convince the company I deserve exotic hardware - like a functioning laptop - I stopped volunteering to take on such tasks.
 
Enjoy!

PS- what's an NAS head?
Thanks! I'll try it next week, when I have this to process again!

A NAS head is what we call Network Attached Storage that is simultaneously mounted to multiple Linux and Windows boxes. That may not be the correct technical term, of course! :) We use it for securely sharing files between the servers.
 
Thanks! I'll try it next week, when I have this to process again!

Grant- try the one in post #26 of this thread. If you don't enter a constant string for use before or after the list, it won't give empty strings (pairs of quotes) before and after the finished list. It also lets you choose the starting row in case there is a header row- the one in the earlier post will force you to delete this row or change a constant in the code.
 
Back
Top