SQL help, parsing a field
AsSiMiLaTeD
Posts: 11,725
I need some help parsing a field in SQL, hoping some here has done this before. I have a field separated by periods, and I need each section parsed into a separate field. The trick is that the number of characters in each section can differ, so I can't use the substring function. here's an example
Raw fields:
****.1234.123456
****.123456789.12345
****.12345.1234
would need to look like this when parsed.
field1 field2 field3
**** 1234 123456
**** 123456789 12345
**** 12345 1234
Anyone have anything that can do that Logic?
Raw fields:
****.1234.123456
****.123456789.12345
****.12345.1234
would need to look like this when parsed.
field1 field2 field3
**** 1234 123456
**** 123456789 12345
**** 12345 1234
Anyone have anything that can do that Logic?
Post edited by AsSiMiLaTeD on
Comments
-
What platform is your DB server on? Sometimes it's much easier to do things outside of SQL.
That being said, let me think for a few...
Also, what's the database flavor? Is the string always a triplet? -
SQL Server 2005. The field is always a triplet, but there are a couple other possibilities I should mention:
null
****..
****.0.0 -
I agree w/ Vschneider- for example SAS has a function for something like that, so I could just use this:
scan(var,1,".") would grab the first field delimited by a period.
W/ MS SQL Server, you could probably do it w/ a combination of CHARINDEX and SUBSTRINGGallo Ref 3.1 : Bryston 4b SST : Musical fidelity CD Pre : VPI HW-19
Gallo Ref AV, Frankengallo Ref 3, LC60i : Bryston 9b SST : Meridian 565
Jordan JX92s : MF X-T100 : Xray v8
Backburner:Krell KAV-300i -
Here's an example for Sybase (my platform of choice), by MSSQL 2005 will be similar:
1> declare @str_length int, 2> @dot_index int, 3> @dot_char char(1), 4> @string varchar(255), 5> @field1 varchar(255) 6> 7> select @string = '****.123456789.12345', @dot_char = '.' 8> print 'input string = "%1!", field separator = "%2!"', @string, @dot_char 9> 10> select @str_length = char_length(@string), 11> @dot_index = charindex(@dot_char,@string) 12> 13> print 'str_length = %1!, separator index = %2!', @str_length, @dot_index 14> 15> select @field1 = substring(@string, 1, @dot_index-1), 16> @string = substring(@string, @dot_index+1, @str_length) 17> 18> print 'field1 = "%1!", parsed string = "%2!"', @field1, @string 19> go input string = "****.123456789.12345", field separator = "." str_length = 19, separator index = 4 field1 = "****", parsed string = "123456789.12345" 1>
Check the print statements in the code and the results. It works for ".123456789.12345" also, returning "" and "123456789.12345".
- Val -
Did you try LTRIM,RTRIM?
-
LTRIM/RTRIM only remove leading or trailing spaces, AFIK
EDIT: BTW, I now have a looping version of that routine with some error checking to take care of the empty string case, too, but I think you got the idea.
If the data is simply a column in a MSSQL table, you can write a cursor to process the entire table. That won't be pretty but will get the job done.
- Val -
Ok.
Field1 would be:
SUBSTRING(RAW,1,CHARINDEX(".",RAW,1)-1)
Temp1 would be:
SUBSTRING(RAW,CHARINDEX(".",RAW,1)+1)
Field2 would be:
SUBSTRING(Temp1,1,CHARINDEX(".",Temp1,1)-1)
Temp2:
SUBSTRING(Temp1,CHARINDEX(".",Temp1,1)+1)
Field3 would be:
SUBSTRING(Temp2,1,CHARINDEX(".",Temp2,1)-1)
You can nest them to get rid of the temp vars. I'm assuming that if length is left off SUBSTRING it'll take the rest of the string from the starting point (not sure if that's true on your platform)... and you'll have to deal w/ your special cases.Gallo Ref 3.1 : Bryston 4b SST : Musical fidelity CD Pre : VPI HW-19
Gallo Ref AV, Frankengallo Ref 3, LC60i : Bryston 9b SST : Meridian 565
Jordan JX92s : MF X-T100 : Xray v8
Backburner:Krell KAV-300i -
errr... That's exactly what the code above does with some print statements to show how it is working.
-
My bad
Please try this and see if it works.
CREATE TABLE #Table1 (Col1 varchar(30))
INSERT INTO #Table1 (Col1) Values ('****.1234.123456')
INSERT INTO #Table1 (Col1) Values ('****.123456789.12345')
INSERT INTO #Table1 (Col1) Values ('****.12345.1234')
select
rtrim(replace(parsename(replace(Col1, '.', ' '), 1), '.', ' ')) as col1
from
#Table1
DROP TABLE #Table1 -
Cool, this compiles in MS SQL 2005, and returns the original string with spaces in place of dots. But how about separating the Col1 into individual fields?
-
VSchneider wrote: »errr... That's exactly what the code above does with some print statements to show how it is working.
heh. oops, I totally didn't read that mess of code. Looks like you do have to give substring a length.Gallo Ref 3.1 : Bryston 4b SST : Musical fidelity CD Pre : VPI HW-19
Gallo Ref AV, Frankengallo Ref 3, LC60i : Bryston 9b SST : Meridian 565
Jordan JX92s : MF X-T100 : Xray v8
Backburner:Krell KAV-300i -
VSchneider wrote: »Cool, this compiles in MS SQL 2005, and returns the original string with spaces in place of dots. But how about separating the Col1 into individual fields?
That means rtrim doesn't work in 2005 I think. I do not have 2005 to test with. There might be some other way. -
I'll take a look at you guys' posts and report back, thanks
-
Just to show why you should consider doing the work in other than SQL environment...here's C# code that takes in a string and separates it into substrings based on regular expression, in this case ".". Live's a little easier when you have more powerful tools to work with than just SQL.
using System.Text.RegularExpressions; String[] GetFieldsSeparatedByPeriod(String Input) { Regex re = new Regex("."); return re.Split(Input); }
-
Saw it online and I thought you might get an idea.
declare @string varchar(500)
set @string = '****.12345.1234'
declare @pos int
declare @piece varchar(500)
-- Need to tack a delimiter onto the end of the input string if one doesn't exist
if right(rtrim(@string),1) <> '.'
set @string = @string + '.'
set @pos = patindex('%.%' , @string)
while @pos <> 0
begin
set @piece = left(@string, @pos - 1)
-- You have a piece of data, so insert it, print it, do whatever you want to with it.
print cast(@piece as varchar(500))
set @string = stuff(@string, 1, @pos, '')
set @pos = patindex('%.%' , @string)
end -
I would recommend doing the parsing in something other than SQL as this doesn't lend itself very well to SQL. Perhaps C#?Just to show why you should consider doing the work in other than SQL environment...here's C# code that takes in a string and separates it into substrings based on regular expression, in this case ".". Live's a little easier when you have more powerful tools to work with than just SQL.
using System.Text.RegularExpressions; String[] GetFieldsSeparatedByPeriod(String Input) { Regex re = new Regex("."); return re.Split(Input); }
You could stick with Regular Expressions and do this:using System.Text.RegularExpressions; String[] GetFieldsSeparatedByPeriod(String Input) { Regex re = new Regex("\\."); return re.Split(Input); }
or go the easy route:string[] Str = MyString.Split('.');
Not sure which method is faster as I believe the Split member of String uses Regular Expressions but I'm not 100% sure. -
Kris, you're correct about period, it is one of the metacharacters. The reason I used Regular Expressions is because it can do much more than just splitting. There's really only one extra line to use versus the string method so it's not anymore complicated.
Regex re = new Regex("\\."); String[] str = re.Split(MyString);