ASPN ActiveState Programmer Network  
ActiveState, a division of Sophos

Title: Extract the Korean language (Hangul) codes
Submitter: Jong-Pork Park
Last Updated: 2001/05/30
Version no: 1.0
Category: Miscellaneous

 

Description:

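This regular expression extracts Hangul (Korean) text from web pages. It matches pairs of bytes in the range 0xA0-0xFE, the two-byte form used by legacy Korean encodings such as EUC-KR, so the input must be a byte string in such an encoding rather than UTF-8.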

Usage: Text Source

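# Save this script in a two-byte Korean encoding such as EUC-KR; the patterns match raw bytes.
$hangul = '안녕하세요? 제 이름은 Park, Jong-Pork 입니다.';
@hangul = ($hangul =~ /((?:[\xA0-\xFE]{2})+)/g); # extract hangul: each run of two-byte characters
$hangul = "@hangul"; # join the runs with single spaces
$hangul =~ s/([\xA0-\xFE]{2})[\x7F-\x9F]/$1/g; # correct noise: drop a stray 0x7F-0x9F byte after a character
$hangul =~ s/[A-Za-z]+//g; # remove English letters
$hangul =~ s/\s+//g; # remove white space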

print $hangul;
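
The extraction step can also be tried without depending on how the script file itself is encoded. The following is a minimal sketch, not part of the original recipe: it spells out the sample word as explicit EUC-KR byte escapes, and the helper name extract_two_byte_runs is hypothetical, introduced only for illustration. It assumes the input is an EUC-KR byte string.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original recipe): returns every run of
# two-byte characters whose bytes both fall in the range 0xA0-0xFE.
sub extract_two_byte_runs {
    my ($text) = @_;
    return $text =~ /((?:[\xA0-\xFE]{2})+)/g;
}

# The EUC-KR bytes for the first word of the recipe's sample sentence,
# surrounded by plain ASCII text.
my $bytes = "Hi \xBE\xC8\xB3\xE7\xC7\xCF\xBC\xBC\xBF\xE4? Park";

my @runs = extract_two_byte_runs($bytes);
printf "%d run(s), %d bytes in the first\n", scalar(@runs), length($runs[0]);
```

Because the pattern only ever captures high-byte pairs, ASCII letters and punctuation never reach the result. Text that is already UTF-8 would need a different approach, since a Hangul syllable there takes three bytes.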

The license for this recipe is available here.

Discussion:

This regular expression was used in a Korean-language World Wide Web search engine.






© 2006 ActiveState Software Inc. All rights reserved.