Jump to content
Sign in to follow this  
Gecko

Algoritmica

Recommended Posts

Am dat peste un challenge interesant mai devreme. Am de reprodus functionalitatea de @mention, regasita pe majoritatea chaturilor, pentru un site la care lucrez, iar la partea de afisare a rezultatelor, am descoperit ca e destul de tricky sa faci highlight pe caracterele gasite, atunci cand nu sunt consecutive.

 

Sa luam ca exemplu lista urmatoare de utilizatori:

 

- Gecko

- Teodor

- Benone

 

Iar ca exemplu de cautare, sa presupunem stringul "eo".

 

Un algoritm simplu, ar cauta dupa acel string cu regex in toata lista si ar returna un singur rezultat:

 

- Teodor

 

E corect rezultatul, insa un user mai avansat s-ar astepta la o cautarea mai complexa. De exemplu, e destul de comun cazul in care se sparge stringul intr-un array de caractere si sa caute dupa ele in ordinea data, mai ales in cazul editoarelor de cod. Astfel incat rezultatele sa fie:

 

- Teodor

- Gecko

- Benone

 

Challenge-ul consta in conceperea unui algoritm care returneaza lista aceasta, cu tot cu highlight-uri.

 

Pentru a face highlight unui caracter trebuie sa folositi tag-urile HTML <mark> inainte si </mark> dupa.

 

Deci, mai precis, algoritmul ar trebui sa afiseze urmatorul rezultat, la cautarea "eo":

 

- T<mark>e</mark><mark>o</mark>dor

- G<mark>e</mark>ck<mark>o</mark>

- B<mark>e</mark>n<mark>o</mark>ne

 

Postati solutia direct aici.

 

Voi posta si eu solutia mea peste cateva zile.

 

Primii care au rezolvat problema:

- @Sim Master

- @AdIntended

-

-

-

  • Upvote 1

Share this post


Link to post
Share on other sites
2 minutes ago, fallen_angel said:

Adică vrei să-ți facem jobul gratis, tu să iei banii. Nice try.

:))

 

m6zqEVk.png

Share this post


Link to post
Share on other sites

Cred ca e stupida metoda mea, dar...

 

var original_text = 'Gecko';
var search_query = 'eo';

var resulted_text = original_text;
for (var i = 0; i < search_query.length; i++) {
    var replace = resulted_text.replace(search_query[i], '<mark>' + search_query[i] + '</mark>');
    
    if (replace != resulted_text) {
        resulted_text = replace;
    } else {
        resulted_text = original_text;
        break;
    }
}

console.log(resulted_text);

Edited by Sim Master

Share this post


Link to post
Share on other sites
2 hours ago, Sim Master said:

Cred ca e stupida metoda mea, dar...

 


var original_text = 'Gecko';
var search_query = 'eo';

var resulted_text = original_text;
for (var i = 0; i < search_query.length; i++) {
    var replace = resulted_text.replace(search_query[i], '<mark>' + search_query[i] + '</mark>');
    
    if (replace != resulted_text) {
        resulted_text = replace;
    } else {
        resulted_text = original_text;
        break;
    }
}

console.log(resulted_text);

AkiCKaU.png

 

Sunt niste cazuri interesante care apar, de-aia am zis ca e tricky. :))

Share this post


Link to post
Share on other sites

Of, futa-l Stalin.


var original_text = 'Admin';
var search_query = 'ain';

var resulted_text = '';
for (var i = 0; i < original_text.length; i++) {
    if (original_text[i].toLowerCase() == search_query[0].toLowerCase()) {
        resulted_text += '<mark>' + original_text[i] + '</mark>';
        search_query = search_query.substr(1);

        if (!search_query.length) {
            resulted_text += original_text.substr(i+1);
            break;
        }
    } else {
        resulted_text += original_text[i];
    }
}

if (search_query.length) {
    resulted_text = original_text;
}

console.log(resulted_text);

Ce alte cazuri pot sa mai apara? :)) 

Share this post


Link to post
Share on other sites
10 hours ago, Sim Master said:

Ce alte cazuri pot sa mai apara? :)) 

O alta problema de care m-am lovit a fost in cazul stringului "abecedar", la cautarea "ce", sa-mi inlocuiasca astfel: abecedar in loc de: abecedar. Dar in cazul tau nu se intampla, pentru ca ai inteles (mai repede decat mine) ca trebuie sa inlocuiesti cate un caracter si sa te descotorosesti de el dupa ce l-ai folosit. Felicitari! E foarte buna abordarea ta.

  • Thanks 1

Share this post


Link to post
Share on other sites

@Sim Master Eram curios daca solutia ta functioneaza mai bine decat a mea, asa ca am facut niste benchmarks pe ambele, plus a ta optimizata un pic.

 

Solutia mea si rezultatele benchmark-urilor sunt in spoiler. Pentru cei ce vor sa participe la challenge, inca pot sa o faca. Voi reveni cu alt post sa includ si algoritmii lor in benchmark.
 

Spoiler

 

1 repetite pe fiecare

7gBT8J9.png

 

10

LtZJ2Vn.png

 

100

0DUbX13.png

 

1000

IJH8CkI.png

 

1 000 000

gU9OuFC.png

 

Solutia mea este bazata pe regex:


const g = (string, term) {
    term = term.trim();

    if (! term) {
        return string;
    }

    const regex = new RegExp(`(${term.split('').join(')(.*?)(')})`, 'i');

    let replacer = '', i;

    for (i = 1; i < (term.length - 1) * 2; i += 2) {
        replacer += '<mark>$' + i + '</mark>$' + (i + 1);
    }

    replacer += '<mark>$' + i + '</mark>';
    
    return string.replace(regex, replacer);
};

 

Codul pentru toate cele 3 variante de solutii testate + benchmark:

 


const { performance } = require('perf_hooks');

class Solution {
    static g (string, term) {
        term = term.trim();

        if (! term) {
            return string;
        }

        const regex = new RegExp(`(${term.split('').join(')(.*?)(')})`, 'i');

        let replacer = '', i;

        for (i = 1; i < (term.length - 1) * 2; i += 2) {
            replacer += '<mark>$' + i + '</mark>$' + (i + 1);
        }

        replacer += '<mark>$' + i + '</mark>';
        
        return string.replace(regex, replacer);
    }

    static simMaster (string, term) {
        term = term.trim();
    
        if (! term) {
            return string;
        }

        let result = '';

        for (let i in string) {
            if (string[i].toLowerCase() === term[0].toLowerCase()) {
                term = term.substr(1);
                result += `<mark>${string[i]}</mark>`;

                if (! term.length) {
                    break;
                }
            } else {
                result += string[i];
            }
        }

        return result;
    }

    static simMasterOptimized (string, term) {
        term = term.trim().toLowerCase();
    
        if (! term) {
            return string;
        }

        const stringLower = string.toLowerCase();
        let result = '';

        for (let i in string) {
            if (stringLower[i] === term[0]) {
                term = term.slice(1);
                result += `<mark>${string[i]}</mark>`;

                if (! term.length) {
                    break;
                }
            } else {
                result += string[i];
            }
        }

        return result;
    }
}

if (process.argv[2] === 'benchmark') {
    let begin, ob = '';

    const string = process.argv[3];
    const term = process.argv[4];
    const reps = process.argv[5] || 1;

    for (let fn of Object.getOwnPropertyNames(Solution)) {
        if (typeof Solution[fn] === 'function') {
            begin = performance.now();
            ob += `${fn}(${string}, ${term}) x ${reps}\n`;

            for (let i = 0; i < reps; i++) {
                Solution[fn](string, term);
            }

            ob += `== ${(performance.now() - begin).toFixed(3)}ms avg\n`;
        }
    }

    console.log(ob);
} else {
    for (let i = 0; i < +(process.argv[5] || 1); i++) {
        Solution[process.argv[2]](process.argv[3], process.argv[4]);
    }
    
    console.log(Solution[process.argv[2]](process.argv[3], process.argv[4]));
}


 

 

  • Like 1

Share this post


Link to post
Share on other sites

Nu-i rau, dar cred ca e loc de mai bine.

 


function sm_v2(string, term) {
    term = term.trim().toLowerCase();

    if (!term) {
        return string;
    }

    const stringLower = string.toLowerCase();
    let result = '';
    let termIndex = 0;

    for (var i = 0; i < string.length; i++) {
        if (term[termIndex] !== undefined && stringLower[i] == term[termIndex]) {
            result += '<mark>' + string[i] + '</mark>';
            termIndex++;
        } else {
            result += string[i];
        }
    }

    return termIndex == term.length ? result : string;
}

 

Image%202019-09-28%20at%2011.31.36%20AM.

  • Like 1

Share this post


Link to post
Share on other sites
import re


replace = lambda char, s_: re.sub(r"({})".format(char), r"<mark>\1</mark>", s_, 1)


def find(what, where):
    what = list(set(what))
    for char in what:
        where = replace(char, where)
    return where


#test
for x in ["Gecko", "Teodor", "Benone"]:
    print(find("eo",x))

print(find("ec", "abecedar"))

 

rezultate:

 

G<mark>e</mark>ck<mark>o</mark>
T<mark>e</mark><mark>o</mark>dor
B<mark>e</mark>n<mark>o</mark>ne
ab<mark>e</mark><mark>c</mark>edar

 

 

 

Edited by AdIntended

Share this post


Link to post
Share on other sites
2 hours ago, Sim Master said:

Nu-i rau, dar cred ca e loc de mai bine.


function sm_v2(string, term) {
    term = term.trim().toLowerCase();

    if (!term) {
        return string;
    }

    const stringLower = string.toLowerCase();
    let result = '';
    let termIndex = 0;

    for (var i = 0; i < string.length; i++) {
        if (term[termIndex] !== undefined && stringLower[i] == term[termIndex]) {
            result += '<mark>' + string[i] + '</mark>';
            termIndex++;
        } else {
            result += string[i];
        }
    }

    return termIndex == term.length ? result : string;
}

 

Excelent!

 

@AdIntended Codul tau la cautarea "ce" in abecedar, da mark pe ab *e* *c* edar in loc de abe *c* *e* dar.

Share this post


Link to post
Share on other sites
import re
from collections import OrderedDict


def replace(char, s_, pos):
    last_index = s_.find(char)
    if pos <= 0:
        return re.sub(r"({})".format(char), r"<mark>\1</mark>", s_[:last_index + 1], 1), last_index
    return re.sub(r"({})".format(char), r"<mark>\1</mark>", s_[pos + 1:], 1), last_index


def find(what, where):
    what = "".join(OrderedDict.fromkeys(what.lower()))
    result: str = ""
    last_index = 0
    for char in what.lower():
        temp, last_index = replace(char, where, last_index)
        result += temp
    return result


for x in ["Gecko", "Teodor", "Benone"]:
    print(find("eo", x))

print(find("ce", "abecedar"))

rezultate:

G<mark>e</mark>ck<mark>o</mark>
T<mark>e</mark><mark>o</mark>dor
B<mark>e</mark>n<mark>o</mark>ne
abe<mark>c</mark><mark>e</mark>dar

 

acum respecta ordinea

Share this post


Link to post
Share on other sites

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Paste as plain text instead

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

Sign in to follow this  

×
×
  • Create New...