I work with LDAP on a regularly basis. I frequently have to pull data using ldapsearch. While the data that ldapsearch spits out is a decent representation sometimes I want something a bit easier to work with.
The format I use most tends to be CSV. While the acronym stands for comma separated value the format may be used to describe data that is printed out line by line and separated by anything you can imagine.
The problem for me is finding a decent LDIF to CSV converter. The ones that I have tried tend to choke on any number of issues. Some of these include binary data or failing to normalize the multivalued attributes.
I finally got sick enough of these issues that I decided to write my own. You can download it here:
Python/Linux/Source
Windows
Update: Pushed the code to GitHub: https://github.com/tachang/ldiftocsv
Hopefully the Windows binary works for you. If it doesn't then just download Python 2.6 and run the source manually. I am providing the binary just for convenience.
Using LDIFtoCSV
Running "python LDIFtoCSV.py" should give you the usage text.
usage: LDIFtoCSV.py [options]
-o <filename> : File to write output. By default this is set to sys.stdout
-l <filename> : File to write logging output. By default there is no logging.
-F <char> : Character to separate the fields by. By default this is a comma. i.e. -F","
-D <char> : Character to delimit the text value by. By default this is a double quote. i.e. -D"""
-M <num> : The maximum number of columns a multivalued attribute should take up (default: 5). This is common with the objectClass attribute where it can have over 20 values. Do you want to have 20 columns each with the same heading objectClass or do you want to limit it.
Here are some common command lines that I use (assuming you have a test.ldif):
Outputs the CSV straight to standard output:
python LDIFtoCSV.py test.ldif
Outputs CSV to standard output with semicolons as the delimiter:
python LDIFtoCSV.py -F"|" -D"^" test.ldif
50 comments:
the website for the csv2ldif seems offline or non existent. did you move it somewhere else ?
Should be back up. If you decide to try the tool out please let me know what you think!
Very useful script - you've just saved me a lot of time. Thanks!
Hello,
Thanks for this great program. Is it possible to include all multiple values for the same LDAP attribute as one csv entry separated by semicolons instead of setting a maximum number and limiting the amount of multiple values stored in the CSV?
Ie -
favorite - "Ice cream"
favorite - "Apples"
In the csv file:
..., "Ice cream";"Apples",....
@Mujtaba - Right now there isn't but that is a great idea for the next version. Thanks!
Any idea on when the next version will be out? I need this functionality for a project. :)
@Mujtaba Right now I do not have any date for a new version. If I need the tool to do additional functionality then I will release a new version. Of course you are free to download the source and modify it to fit your needs. If you are really in a need of customizations feel free to e-mail me directly and we can discuss a nominal rate.
Thank you, I will modify it myself.
I have addressed similar problems in C++. My solution is simplier but could fit the needs of someone who adopts C++ as the language of choice. Here is the link: http://register72.wordpress.com/2009/07/09/ldif-to-tsv-conversion/
This is wonderful, exactly what I need.
Is the .exe and/or source redistributable?
@Kenneth
Not sure about the executable (since there might be some Microsoft DLL stuff in there) but feel free to use the source and downloading a python interpreter.
This is brilliant. Just as you experienced; I could not get any softwarew that coudl do the conversion well but this has save me a lot of hard work.
I hope to see future enhancements; as this is a tool for the future for all that use the LDAP platform.
We used it to convert our 3000 employee LDIF to a spreadsheet, and it worked like a charm. Thanks! Only thing that would make it better would be the ability to either make a connection directly to the LDAP or to accept the input file from stdin so we could pipe the ldapsearch results to it! For a one-time use, though, it's great as is.
Thanks a lot, just converted a LDIF-file containing 72000 entries. It worked well for me.
Excellent script!
Thank you very much for sharing it.
Thank you, Thank you, Thank you!
Worked perfect the first try!
Awesome tool!!! Works like a charm.
This is perfect. I found it after a long struggle with various other tools..
Thanks much!
I am also fed up with all tools that tries to convert ldif to csv, but having tried this one, it suffer from the same two problems of every other tool I used. Both two problems are because most such tools are written by people from the west with little care on ideographs.
First, your tool like other don't distinguis ideograph string from binary pictures. It is difficult to tell ideographs from pictures, despite different attribute type, they all look like binaries on the command line pipe. The application that can correctly write out ideograph to csv/tsv must either allow user to configure which field is a string and which is binary, or, even better, be schema-aware. To be schema-aware the application must act as ldap client, instead of reading from ldif file, because when exported to ldif file the schema is already lost.
Second, it should offer the possibility to only convert one language version to csv/tsv, or offer user to consider multiple language versions as multiple values. Language extension is not much used in the west. Consider this:
line100: cn: Zhang Weiwu
line101: cn;lang-zh: 張韡武
Line 101 is the Chinese spelling of the same name.
It is not possible, as far as I know, to produce ldif with ldapsearch so that only include line100 but not line101, thus tools must be careful when handling such data source.
I know it is difficult to make tools that doesn't suck for all users, but if you can manage that, it is the kind of achievement that many others before you never even manage to get close to.
@Zhang Out of curiosity how does the program fail?
Behavior of your system:
# the blog doesn't accept HTML pre tag. From now on is start of pre
$ ldapsearch_intranet cn=Yu*Liansu givenName sn cn title jobtitle organization organizationalUnit postalcode country st locality street telephonenumber fax labeledURI description | tee /tmp/test.ldif
# extended LDIF
#
# LDAPv3
# base with scope subtree
# filter: cn=Yu*Liansu
# requesting: givenName sn cn title jobtitle organization organizationalUnit postalcode country st locality street telephonenumber fax labeledURI description
#
# yuliansu, contacts, realss.com, eoa.cn
dn: uid=yuliansu,ou=contacts,ou=realss.com,dc=eoa,dc=cn
description: Assistant, started to work in mid-2005
sn:: 5L+e
sn;lang-en: yu
givenName:: 6I6y6IuP
cn:: 5L+e6I6y6IuP
cn;lang-en: yuliansu
givenName;lang-en: liansu
userPassword:: e21kNX1RdmdhTytSZkw5NTB0bUNDM09JamtBPT0=
telephoneNumber: 010 69020190
st:: 5YyX5Lqs
st;lang-en: beijing
street:: 5rW35reA5Yy65LiH5p+z5LiH5rOJ5paw5pawMTHmoIsz6ZeoMTAx5a6k
street;lang-en: haidianquwanliuwanquanxinxin11dong3men101shi
l:: 5YyX5Lqs5biC
l;lang-en: beijingshi
postalCode: 100089
objectClass: top
objectClass: person
objectClass: organizationalPerson
objectClass: evolutionPerson
objectClass: posixAccount
objectClass: shadowAccount
objectClass: phpgwAccount
objectClass: qmailUser
# search result
search: 2
result: 0 Success
# numResponses: 2
# numEntries: 1
$ python /tmp/LDIFtoCSV.py /tmp/test.ldif
"cn","cn;lang-en","description","dn","givenName","givenName;lang-en","l","l;lang-en","objectClass","objectClass","objectClass","objectClass","objectClass","postalCode","result","search","sn","sn;lang-en","st","st;lang-en","street","street;lang-en","telephoneNumber","userPassword",
"'\xe4\xbf\x9e\xe8\x8e\xb2\xe8\x8b\x8f'","yuliansu","Assistant, started to work in mid-2005","uid=yuliansu,ou=contacts,ou=realss.com,dc=eoa,dc=cn","'\xe8\x8e\xb2\xe8\x8b\x8f'","liansu","'\xe5\x8c\x97\xe4\xba\xac\xe5\xb8\x82'","beijingshi","top","person","organizationalPerson","evolutionPerson","posixAccount","100089","","","'\xe4\xbf\x9e'","yu","'\xe5\x8c\x97\xe4\xba\xac'","beijing","'\xe6\xb5\xb7\xe6\xb7\x80\xe5\x8c\xba\xe4\xb8\x87\xe6\x9f\xb3\xe4\xb8\x87\xe6\xb3\x89\xe6\x96\xb0\xe6\x96\xb011\xe6\xa0\x8b3\xe9\x97\xa8101\xe5\xae\xa4'","haidianquwanliuwanquanxinxin11dong3men101shi","010 69020190","{md5}QvgaO+RfL950tmCC3OIjkA==",
"","","","None","","","","","","","","","","","0 Success","2","","","","","","","","",
# end pre tag here
If problem 1 solved, all appearance of things like \x9f\xb should be ideograhs like 莲.
If problem 2 solved, there should be ways to omit every column whose field name is "*;lang-en".
It's great you want to look at this issue. As an asian user I cannot help feeling this:
the world is not better because there are 10+ tools, it would be better if there is only one tool that does things well.
And often I read "I decide to make my own tool" on the web is a creation again ignoring Asian text, not only ldap tools but all tools. You cannot help being critical after all such experiences, even given all tools are granted as free in freedom without charge:)
The download appears not to be working again. Can you check it?
The download links don't appear to be working, fyi. BTW, is your tool the same as the "LDIF2CSV" one on softpedia, wondering if you published it there?
the download isn't working?? do you know of a way where i can convert multiple ldif files 19000 to be precise into csv and load it onto a DB? if anyone has any answers please lemme knw..
I just updated the links. I will be putting it on github shortly.
Thanks very much for your useful tool. If you ever have time to implement Mujtaba's suggestion (collect multi-values into one delimited string), it would be extremely beneficial for my use-case. Cheers :)
Thanks for the very useful tool.
I would like to thank you for the efforts you have made in writing this post. I am hoping
the same best work from you in the future as well.
Frases de Amor
Nice stuff you got, very interesting to read.
Well, I do have also in my sleeves, if you have time don't forget to visit
Houston Electrician
I would like to thank you for the efforts you have made in writing this post. I am hoping
the same best work from you in the future as well.
Reliable VPN Proxy Service
Nice stuff you got, very interesting to read.
Well, I do have also in my sleeves, if you have time don't forget to visit
cheap voip call
I would like to thank you for the efforts you have made in writing this post. I am hoping
the same best work from you in the future as well.
compra venta
Nice stuff you got, very interesting to read.
Well, I do have also in my sleeves, if you have time don't forget to visit
segunda mano
Nice stuff you got, very interesting to read.
Well, I do have also in my sleeves, if you have time don't forget to visit
vpn liberty reserve
I would like to thank you for the efforts you have made in writing this post. I am hoping the same best work from you in the future as well.
Escorts San Diego
I would like to thank you for the efforts you have made in writing this post. I am hoping the same best work from you in the future as well.
Escorts San Diego
Nice stuff you got, very interesting to read.
Well, I do have also in my sleeves, if you have time don't forget to visit
used cnc router
Nice stuff you got, very interesting to read.
Well, I do have also in my sleeves, if you have time don't forget to visit
Electrician Greenville SC
Nice stuff you got, very interesting to read.
Well, I do have also in my sleeves, if you have time don't forget to visit
Greenville air conditioning
Fantastic, it helped me to migrate contacts in LDIF format into GMail!!!
Hello Jeff, sorry to contact you so randomly.. I was seeking some help from online and I believe you might be able to help me. I am currently working on a project where I have to migrate a table from DB2 to LDIF and import into IBM LDAP. Would you please able to share some knowledge on this?
Thanks
Ankur
I think I fixed the utf character problem:
In ldif.py, I added .encode('utf8') to line 325:
attr_value = base64.decodestring(unfolded_line[colon_pos+2:]).decode('utf8')
in LDIFtoCSV.py I changed line 128 to 139 to
if( attributeName in entry ):
i = 0
while( i < numberOfTimesToPrint ):
if( i < len(entry[attributeName])):
self.defaultOutput.write(self.textDelimiter + entry[attributeName][i].encode('utf8') + self.textDelimiter + self.fieldSeparatorCharacter)
else:
self.defaultOutput.write(self.textDelimiter + self.textDelimiter + self.fieldSeparatorCharacter)
i = i + 1
# If the attribute name is dn, print the fully qualified distinguished name
Now this ldif:
dn: CN=MACHINE,OU=Domain Controllers,DC=our,DC=domain
name: MACHINE
operatingSystem:: V2luZG93cyBTZXJ2ZXLCriAyMDA4IFN0YW5kYXJk
Converts correctly to:
Windows Server® 2008 Standard
I meant
In ldif.py, I added .decode('utf8') to line 325:
attr_value = base64.decodestring(unfolded_line[colon_pos+2:]).decode('utf8')
Hi,
I am loooking fro CSV to Ldif conversion,and this is very urgent for me...I see so many tools online and getting confused.
Can you pl. advise?
Perfect !! - Cojonuda !!
It does exactly what I wanted
Great Work, Thanks for providing this Utility, windows utility worked great for me.
Unfortunately, the link to your source no longer appears to be valid. Is there a place where it is available?
Regardless of what stage you are at in your profession movement, we have agreed elite to more readily assist you with planning for your vocation. Professional graphic design
Post a Comment