Find records with same string with extra character





.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty{ margin-bottom:0;
}






up vote
3
down vote

favorite
1












OK, so I have a Microsoft SQL Server 2014 database table called owner with around 90,000 records with owner information, another called vehicle with vehicle information



Owner_Name                   owner_id       V_name     owner_id    exempt
------------------------------------- ------------------------------
JACOB JAMISON & JESSICA 35 Civic 35 H3
JACOB JAMISON M & JESSICA B 39 Accord 39 H3
BLACKSON BARRINGTON 56 Bugatti 56 H6
BLACKSON BARRINGTON H 98 SSC 98 H7
BRUSTER MICHAEL 107 Corvette 107 H9


I'm trying to find all records that have more than one exemption on a vehicle
( H0 means no exemption). This code below has worked well, as long as the name is exactly the same. However, if there's a variation, such as an extra letter or if it's entered backwards, then it won't return those records. I've looked at things like SOUNDEX, but this wouldn't work in my scenario.



SELECT Owner_name
, COUNT(Owner_name) AS 'xNameAppears'
, COUNT(v.exempt) AS 'ExemptionCount'
FROM owner o
INNER JOIN vehicle V ON V.owner_id = o.owner_id
WHERE v.exempt <> 'H0'
GROUP BY O.owner_name
HAVING COUNT(v.exempt) > 1


Is there a solution that would allow me to return records like so, not knowing which owner_name's may be similar? Basically trying to get the server to search through the owner_name column and if there's a similarity such as
JACOB JAMISON & JESSICA and JACOB JAMISON M & JESSICA B then it'll return those records like so:



Owner_Name                      xNameAppears      ExemptCount
-------------------------------------------------------------
JACOB JAMISON & JESSICA 2 2
JACOB JAMISON M & JESSICA B 2 2
BLACKSON BARRINGTON 2 2
BLACKSON BARRINGTON H 2 2


Thank you in advance!










share|improve this question







New contributor




MindXpert is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
















  • 1




    Sure SOUNDEX is suitable? The following all return the same value SOUNDEX('JACOB JAMISON & JESSICA '), SOUNDEX('Jacob Zuma'), SOUNDEX('Jacob Willekens'), SOUNDEX('Jacob S. Coxey, Sr.'), SOUNDEX('Juegief zzzzzzzzzzzz')
    – Martin Smith
    11 hours ago



















up vote
3
down vote

favorite
1












OK, so I have a Microsoft SQL Server 2014 database table called owner with around 90,000 records with owner information, another called vehicle with vehicle information



Owner_Name                   owner_id       V_name     owner_id    exempt
------------------------------------- ------------------------------
JACOB JAMISON & JESSICA 35 Civic 35 H3
JACOB JAMISON M & JESSICA B 39 Accord 39 H3
BLACKSON BARRINGTON 56 Bugatti 56 H6
BLACKSON BARRINGTON H 98 SSC 98 H7
BRUSTER MICHAEL 107 Corvette 107 H9


I'm trying to find all records that have more than one exemption on a vehicle
( H0 means no exemption). This code below has worked well, as long as the name is exactly the same. However, if there's a variation, such as an extra letter or if it's entered backwards, then it won't return those records. I've looked at things like SOUNDEX, but this wouldn't work in my scenario.



SELECT Owner_name
, COUNT(Owner_name) AS 'xNameAppears'
, COUNT(v.exempt) AS 'ExemptionCount'
FROM owner o
INNER JOIN vehicle V ON V.owner_id = o.owner_id
WHERE v.exempt <> 'H0'
GROUP BY O.owner_name
HAVING COUNT(v.exempt) > 1


Is there a solution that would allow me to return records like so, not knowing which owner_name's may be similar? Basically trying to get the server to search through the owner_name column and if there's a similarity such as
JACOB JAMISON & JESSICA and JACOB JAMISON M & JESSICA B then it'll return those records like so:



Owner_Name                      xNameAppears      ExemptCount
-------------------------------------------------------------
JACOB JAMISON & JESSICA 2 2
JACOB JAMISON M & JESSICA B 2 2
BLACKSON BARRINGTON 2 2
BLACKSON BARRINGTON H 2 2


Thank you in advance!










share|improve this question







New contributor




MindXpert is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
















  • 1




    Sure SOUNDEX is suitable? The following all return the same value SOUNDEX('JACOB JAMISON & JESSICA '), SOUNDEX('Jacob Zuma'), SOUNDEX('Jacob Willekens'), SOUNDEX('Jacob S. Coxey, Sr.'), SOUNDEX('Juegief zzzzzzzzzzzz')
    – Martin Smith
    11 hours ago















up vote
3
down vote

favorite
1









up vote
3
down vote

favorite
1






1





OK, so I have a Microsoft SQL Server 2014 database table called owner with around 90,000 records with owner information, another called vehicle with vehicle information



Owner_Name                   owner_id       V_name     owner_id    exempt
------------------------------------- ------------------------------
JACOB JAMISON & JESSICA 35 Civic 35 H3
JACOB JAMISON M & JESSICA B 39 Accord 39 H3
BLACKSON BARRINGTON 56 Bugatti 56 H6
BLACKSON BARRINGTON H 98 SSC 98 H7
BRUSTER MICHAEL 107 Corvette 107 H9


I'm trying to find all records that have more than one exemption on a vehicle
( H0 means no exemption). This code below has worked well, as long as the name is exactly the same. However, if there's a variation, such as an extra letter or if it's entered backwards, then it won't return those records. I've looked at things like SOUNDEX, but this wouldn't work in my scenario.



SELECT Owner_name
, COUNT(Owner_name) AS 'xNameAppears'
, COUNT(v.exempt) AS 'ExemptionCount'
FROM owner o
INNER JOIN vehicle V ON V.owner_id = o.owner_id
WHERE v.exempt <> 'H0'
GROUP BY O.owner_name
HAVING COUNT(v.exempt) > 1


Is there a solution that would allow me to return records like so, not knowing which owner_name's may be similar? Basically trying to get the server to search through the owner_name column and if there's a similarity such as
JACOB JAMISON & JESSICA and JACOB JAMISON M & JESSICA B then it'll return those records like so:



Owner_Name                      xNameAppears      ExemptCount
-------------------------------------------------------------
JACOB JAMISON & JESSICA 2 2
JACOB JAMISON M & JESSICA B 2 2
BLACKSON BARRINGTON 2 2
BLACKSON BARRINGTON H 2 2


Thank you in advance!










share|improve this question







New contributor




MindXpert is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.











OK, so I have a Microsoft SQL Server 2014 database table called owner with around 90,000 records with owner information, another called vehicle with vehicle information



Owner_Name                   owner_id       V_name     owner_id    exempt
------------------------------------- ------------------------------
JACOB JAMISON & JESSICA 35 Civic 35 H3
JACOB JAMISON M & JESSICA B 39 Accord 39 H3
BLACKSON BARRINGTON 56 Bugatti 56 H6
BLACKSON BARRINGTON H 98 SSC 98 H7
BRUSTER MICHAEL 107 Corvette 107 H9


I'm trying to find all records that have more than one exemption on a vehicle
( H0 means no exemption). This code below has worked well, as long as the name is exactly the same. However, if there's a variation, such as an extra letter or if it's entered backwards, then it won't return those records. I've looked at things like SOUNDEX, but this wouldn't work in my scenario.



SELECT Owner_name
, COUNT(Owner_name) AS 'xNameAppears'
, COUNT(v.exempt) AS 'ExemptionCount'
FROM owner o
INNER JOIN vehicle V ON V.owner_id = o.owner_id
WHERE v.exempt <> 'H0'
GROUP BY O.owner_name
HAVING COUNT(v.exempt) > 1


Is there a solution that would allow me to return records like so, not knowing which owner_name's may be similar? Basically trying to get the server to search through the owner_name column and if there's a similarity such as
JACOB JAMISON & JESSICA and JACOB JAMISON M & JESSICA B then it'll return those records like so:



Owner_Name                      xNameAppears      ExemptCount
-------------------------------------------------------------
JACOB JAMISON & JESSICA 2 2
JACOB JAMISON M & JESSICA B 2 2
BLACKSON BARRINGTON 2 2
BLACKSON BARRINGTON H 2 2


Thank you in advance!







sql-server sql-server-2014 full-text-search string-searching






share|improve this question







New contributor




MindXpert is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.











share|improve this question







New contributor




MindXpert is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.









share|improve this question




share|improve this question






New contributor




MindXpert is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.









asked 14 hours ago









MindXpert

183




183




New contributor




MindXpert is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.





New contributor





MindXpert is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.






MindXpert is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.








  • 1




    Sure SOUNDEX is suitable? The following all return the same value SOUNDEX('JACOB JAMISON & JESSICA '), SOUNDEX('Jacob Zuma'), SOUNDEX('Jacob Willekens'), SOUNDEX('Jacob S. Coxey, Sr.'), SOUNDEX('Juegief zzzzzzzzzzzz')
    – Martin Smith
    11 hours ago
















  • 1




    Sure SOUNDEX is suitable? The following all return the same value SOUNDEX('JACOB JAMISON & JESSICA '), SOUNDEX('Jacob Zuma'), SOUNDEX('Jacob Willekens'), SOUNDEX('Jacob S. Coxey, Sr.'), SOUNDEX('Juegief zzzzzzzzzzzz')
    – Martin Smith
    11 hours ago










1




1




Sure SOUNDEX is suitable? The following all return the same value SOUNDEX('JACOB JAMISON & JESSICA '), SOUNDEX('Jacob Zuma'), SOUNDEX('Jacob Willekens'), SOUNDEX('Jacob S. Coxey, Sr.'), SOUNDEX('Juegief zzzzzzzzzzzz')
– Martin Smith
11 hours ago






Sure SOUNDEX is suitable? The following all return the same value SOUNDEX('JACOB JAMISON & JESSICA '), SOUNDEX('Jacob Zuma'), SOUNDEX('Jacob Willekens'), SOUNDEX('Jacob S. Coxey, Sr.'), SOUNDEX('Juegief zzzzzzzzzzzz')
– Martin Smith
11 hours ago












1 Answer
1






active

oldest

votes

















up vote
5
down vote



accepted










The SOUNDEX function can be applied to a column as well.



But since




there's thousands like that




I wouldn't suggest just writing a query to join on a function to do that.



This will likely not perform very well on larger tables:



SELECT *
FROM dbo.vehicle AS v
JOIN dbo.vehicle AS v2
ON SOUNDEX(v2.Owner_Name) = SOUNDEX(v.Owner_Name)
AND v2.Owner_Name <> v.Owner_Name;


I'd rather do something that will make finding this easier in the long-term.



Here's an example:



CREATE TABLE dbo.vehicle (Owner_Name VARCHAR(50));
INSERT dbo.vehicle ( Owner_Name )
SELECT *
FROM (
VALUES
('JACOB JAMISON & JESSICA'),
('JACOB JAMISON M & JESSICA B'),
('BLACKSON BARRINGTON'),
('BLACKSON BARRINGTON H'),
('BRUSTER MICHAEL')
) AS x (Owner_Name);


I'm going to add a computed column based on the function, and then add an index to aid my query.



ALTER TABLE dbo.vehicle ADD Owner_Soundex AS SOUNDEX(Owner_Name);

CREATE INDEX ix_whatever ON dbo.vehicle (Owner_Soundex, Owner_Name);


Validate that everything looks good...



SELECT *
FROM dbo.vehicle AS v


Use a query like this to find imprecise matches:



SELECT *
FROM dbo.vehicle AS v
JOIN dbo.vehicle AS v2
ON v2.Owner_Soundex = v.Owner_Soundex
AND v2.Owner_Name <> v.Owner_Name;





share|improve this answer





















  • You sir, are the man. I didn't think to use SOUNDEX in that manner. This works very well! Indeed that second way is more efficient performance wise. Thank you.
    – MindXpert
    12 hours ago










  • @user1227080 happy to help!
    – sp_BlitzErik
    12 hours ago











Your Answer








StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "182"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});


}
});






MindXpert is a new contributor. Be nice, and check out our Code of Conduct.










 

draft saved


draft discarded


















StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f223108%2ffind-records-with-same-string-with-extra-character%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown

























1 Answer
1






active

oldest

votes








1 Answer
1






active

oldest

votes









active

oldest

votes






active

oldest

votes








up vote
5
down vote



accepted










The SOUNDEX function can be applied to a column as well.



But since




there's thousands like that




I wouldn't suggest just writing a query to join on a function to do that.



This will likely not perform very well on larger tables:



SELECT *
FROM dbo.vehicle AS v
JOIN dbo.vehicle AS v2
ON SOUNDEX(v2.Owner_Name) = SOUNDEX(v.Owner_Name)
AND v2.Owner_Name <> v.Owner_Name;


I'd rather do something that will make finding this easier in the long-term.



Here's an example:



CREATE TABLE dbo.vehicle (Owner_Name VARCHAR(50));
INSERT dbo.vehicle ( Owner_Name )
SELECT *
FROM (
VALUES
('JACOB JAMISON & JESSICA'),
('JACOB JAMISON M & JESSICA B'),
('BLACKSON BARRINGTON'),
('BLACKSON BARRINGTON H'),
('BRUSTER MICHAEL')
) AS x (Owner_Name);


I'm going to add a computed column based on the function, and then add an index to aid my query.



ALTER TABLE dbo.vehicle ADD Owner_Soundex AS SOUNDEX(Owner_Name);

CREATE INDEX ix_whatever ON dbo.vehicle (Owner_Soundex, Owner_Name);


Validate that everything looks good...



SELECT *
FROM dbo.vehicle AS v


Use a query like this to find imprecise matches:



SELECT *
FROM dbo.vehicle AS v
JOIN dbo.vehicle AS v2
ON v2.Owner_Soundex = v.Owner_Soundex
AND v2.Owner_Name <> v.Owner_Name;





share|improve this answer





















  • You sir, are the man. I didn't think to use SOUNDEX in that manner. This works very well! Indeed that second way is more efficient performance wise. Thank you.
    – MindXpert
    12 hours ago










  • @user1227080 happy to help!
    – sp_BlitzErik
    12 hours ago















up vote
5
down vote



accepted










The SOUNDEX function can be applied to a column as well.



But since




there's thousands like that




I wouldn't suggest just writing a query to join on a function to do that.



This will likely not perform very well on larger tables:



SELECT *
FROM dbo.vehicle AS v
JOIN dbo.vehicle AS v2
ON SOUNDEX(v2.Owner_Name) = SOUNDEX(v.Owner_Name)
AND v2.Owner_Name <> v.Owner_Name;


I'd rather do something that will make finding this easier in the long-term.



Here's an example:



CREATE TABLE dbo.vehicle (Owner_Name VARCHAR(50));
INSERT dbo.vehicle ( Owner_Name )
SELECT *
FROM (
VALUES
('JACOB JAMISON & JESSICA'),
('JACOB JAMISON M & JESSICA B'),
('BLACKSON BARRINGTON'),
('BLACKSON BARRINGTON H'),
('BRUSTER MICHAEL')
) AS x (Owner_Name);


I'm going to add a computed column based on the function, and then add an index to aid my query.



ALTER TABLE dbo.vehicle ADD Owner_Soundex AS SOUNDEX(Owner_Name);

CREATE INDEX ix_whatever ON dbo.vehicle (Owner_Soundex, Owner_Name);


Validate that everything looks good...



SELECT *
FROM dbo.vehicle AS v


Use a query like this to find imprecise matches:



SELECT *
FROM dbo.vehicle AS v
JOIN dbo.vehicle AS v2
ON v2.Owner_Soundex = v.Owner_Soundex
AND v2.Owner_Name <> v.Owner_Name;





share|improve this answer





















  • You sir, are the man. I didn't think to use SOUNDEX in that manner. This works very well! Indeed that second way is more efficient performance wise. Thank you.
    – MindXpert
    12 hours ago










  • @user1227080 happy to help!
    – sp_BlitzErik
    12 hours ago













up vote
5
down vote



accepted







up vote
5
down vote



accepted






The SOUNDEX function can be applied to a column as well.



But since




there's thousands like that




I wouldn't suggest just writing a query to join on a function to do that.



This will likely not perform very well on larger tables:



SELECT *
FROM dbo.vehicle AS v
JOIN dbo.vehicle AS v2
ON SOUNDEX(v2.Owner_Name) = SOUNDEX(v.Owner_Name)
AND v2.Owner_Name <> v.Owner_Name;


I'd rather do something that will make finding this easier in the long-term.



Here's an example:



CREATE TABLE dbo.vehicle (Owner_Name VARCHAR(50));
INSERT dbo.vehicle ( Owner_Name )
SELECT *
FROM (
VALUES
('JACOB JAMISON & JESSICA'),
('JACOB JAMISON M & JESSICA B'),
('BLACKSON BARRINGTON'),
('BLACKSON BARRINGTON H'),
('BRUSTER MICHAEL')
) AS x (Owner_Name);


I'm going to add a computed column based on the function, and then add an index to aid my query.



ALTER TABLE dbo.vehicle ADD Owner_Soundex AS SOUNDEX(Owner_Name);

CREATE INDEX ix_whatever ON dbo.vehicle (Owner_Soundex, Owner_Name);


Validate that everything looks good...



SELECT *
FROM dbo.vehicle AS v


Use a query like this to find imprecise matches:



SELECT *
FROM dbo.vehicle AS v
JOIN dbo.vehicle AS v2
ON v2.Owner_Soundex = v.Owner_Soundex
AND v2.Owner_Name <> v.Owner_Name;





share|improve this answer












The SOUNDEX function can be applied to a column as well.



But since




there's thousands like that




I wouldn't suggest just writing a query to join on a function to do that.



This will likely not perform very well on larger tables:



SELECT *
FROM dbo.vehicle AS v
JOIN dbo.vehicle AS v2
ON SOUNDEX(v2.Owner_Name) = SOUNDEX(v.Owner_Name)
AND v2.Owner_Name <> v.Owner_Name;


I'd rather do something that will make finding this easier in the long-term.



Here's an example:



CREATE TABLE dbo.vehicle (Owner_Name VARCHAR(50));
INSERT dbo.vehicle ( Owner_Name )
SELECT *
FROM (
VALUES
('JACOB JAMISON & JESSICA'),
('JACOB JAMISON M & JESSICA B'),
('BLACKSON BARRINGTON'),
('BLACKSON BARRINGTON H'),
('BRUSTER MICHAEL')
) AS x (Owner_Name);


I'm going to add a computed column based on the function, and then add an index to aid my query.



ALTER TABLE dbo.vehicle ADD Owner_Soundex AS SOUNDEX(Owner_Name);

CREATE INDEX ix_whatever ON dbo.vehicle (Owner_Soundex, Owner_Name);


Validate that everything looks good...



SELECT *
FROM dbo.vehicle AS v


Use a query like this to find imprecise matches:



SELECT *
FROM dbo.vehicle AS v
JOIN dbo.vehicle AS v2
ON v2.Owner_Soundex = v.Owner_Soundex
AND v2.Owner_Name <> v.Owner_Name;






share|improve this answer












share|improve this answer



share|improve this answer










answered 13 hours ago









sp_BlitzErik

20.6k1262102




20.6k1262102












  • You sir, are the man. I didn't think to use SOUNDEX in that manner. This works very well! Indeed that second way is more efficient performance wise. Thank you.
    – MindXpert
    12 hours ago










  • @user1227080 happy to help!
    – sp_BlitzErik
    12 hours ago


















  • You sir, are the man. I didn't think to use SOUNDEX in that manner. This works very well! Indeed that second way is more efficient performance wise. Thank you.
    – MindXpert
    12 hours ago










  • @user1227080 happy to help!
    – sp_BlitzErik
    12 hours ago
















You sir, are the man. I didn't think to use SOUNDEX in that manner. This works very well! Indeed that second way is more efficient performance wise. Thank you.
– MindXpert
12 hours ago




You sir, are the man. I didn't think to use SOUNDEX in that manner. This works very well! Indeed that second way is more efficient performance wise. Thank you.
– MindXpert
12 hours ago












@user1227080 happy to help!
– sp_BlitzErik
12 hours ago




@user1227080 happy to help!
– sp_BlitzErik
12 hours ago










MindXpert is a new contributor. Be nice, and check out our Code of Conduct.










 

draft saved


draft discarded


















MindXpert is a new contributor. Be nice, and check out our Code of Conduct.













MindXpert is a new contributor. Be nice, and check out our Code of Conduct.












MindXpert is a new contributor. Be nice, and check out our Code of Conduct.















 


draft saved


draft discarded














StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f223108%2ffind-records-with-same-string-with-extra-character%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown





















































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown

































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown







Popular posts from this blog

How to ignore python UserWarning in pytest?

What visual should I use to simply compare current year value vs last year in Power BI desktop

Script to remove string up to first number